still some to do, but unfurling and the existing header code could use a lookover now
| return ht | ||
| def _v4_false_dup_unfurl_annotations( |
is this an unused function that needs to be deleted?
| "--overwrite", | ||
| help="Option to overwrite existing custom liftover table.", | ||
| action="store_true", | ||
| ) | ||
| parser.add_argument( | ||
| "--test", |
Neither `--overwrite` nor `--test` is referenced anywhere in the script.
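If the flags are meant to stay, one way to wire them in is sketched below. This is only an illustration: `resolve_output`, its two path parameters, and the `--test` help text are hypothetical and not taken from the script under review.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two flags from the diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--overwrite",
        help="Option to overwrite existing custom liftover table.",
        action="store_true",
    )
    parser.add_argument(
        "--test",
        # Hypothetical help text; the original is not shown in the thread.
        help="Write to a test location instead of the release location.",
        action="store_true",
    )
    return parser


def resolve_output(
    args: argparse.Namespace, release_path: str, test_path: str
) -> dict:
    """Turn the parsed flags into settings that a write step could consume."""
    # Assumption: a test run writes somewhere other than the release path.
    path = test_path if args.test else release_path
    return {"path": path, "overwrite": args.overwrite}
```

With something like this in place, `main` could pass `settings["path"]` and `settings["overwrite"]` to whatever call writes the Table, so both flags have a visible effect. If neither behavior is wanted, dropping the arguments is the simpler fix.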
| logger = logging.getLogger("false_dup_genes") | ||
| logger.setLevel(logging.INFO) | ||
| FALSE_DUP_GENES = ["KCNE1", "CBS", "CRYAA"] |
This constant is already defined in create_false_dup_liftover.py. Constants that already exist should be imported rather than redefined. However, I feel like all the false dup code can just be combined into one script, with arguments that can be supplied to either create the Table or export the VCF.
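A rough sketch of what a single combined script could look like, with one flag per step. The flag names `--create-table` and `--export-vcf` and the two placeholder functions are assumptions made for illustration; the real steps would call the existing Table-creation and VCF-export code.

```python
import argparse

# In the combined script, FALSE_DUP_GENES would be defined once (or imported
# from create_false_dup_liftover.py) rather than redefined per script.


def create_table(overwrite: bool) -> str:
    """Placeholder for the step that creates the custom liftover Table."""
    return f"create_table(overwrite={overwrite})"


def export_vcf() -> str:
    """Placeholder for the step that validates and exports the VCF."""
    return "export_vcf()"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Hypothetical flag names; each one switches on a single step.
    parser.add_argument("--create-table", action="store_true")
    parser.add_argument("--export-vcf", action="store_true")
    parser.add_argument("--overwrite", action="store_true")
    return parser


def main(args: argparse.Namespace) -> list:
    """Run the requested steps in order and report which ones ran."""
    steps = []
    if args.create_table:
        steps.append(create_table(args.overwrite))
    if args.export_vcf:
        steps.append(export_vcf())
    return steps
```

Keeping both steps behind flags in one file means the gene list and logger setup live in a single place, and either step can still be run on its own.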
| :param ht: Release Hail Table | ||
| :param vcf_info_reorder: Order of VCF INFO fields | ||
| :return: Hail Table prepared for validity checks and export | ||
| """ |
| :param ht: Release Hail Table | |
| :param vcf_info_reorder: Order of VCF INFO fields | |
| :return: Hail Table prepared for validity checks and export | |
| """ | |
| :param ht: Release Hail Table of false dup genes. | |
| :param vcf_info_reorder: Order of VCF INFO fields. | |
| :return: Hail Table prepared for validity checks and export. | |
| """ |
| :return: Hail Table prepared for validity checks and export | ||
| """ | ||
| logger.info( | ||
| "Unfurling nested gnomAD frequency annotations and add to INFO field..." |
| "Unfurling nested gnomAD frequency annotations and add to INFO field..." | |
| "Unfurling nested gnomAD frequency annotations and adding to INFO field..." |
| return vcf_info_dict | ||
| def _joint_filters(ht: hl.Table) -> hl.Table: |
_joint_filters -> prepare_joint_filters
| variant_qc_filter="RF", | ||
| ) | ||
| custom_filter_dict = { |
see previous note about considering missing to be PASS
| } | ||
| def populate_subset_info_dict( |
| return vcf_info_dict | ||
| def populate_info_dict( |
| def main(args): | ||
| ht = hl.read_table(get_false_dup_genes_path(release_version="4.0")) | ||
| ht = prepare_false_dup_ht_for_validation(ht) | ||
| header_dict = { |
where does header_dict get used? why call prepare_vcf_filter_header twice?
STILL NEED: rest of populated headers for info fields and subsets
Spaghetti Code for FalseDup
Code to take the False Duplication Hail Table (covering three chr21 genes), convert it to a VCF, verify it, and export it.
03-07-24: still needs verification and header added, but just wanted to get the PR opened