close
Skip to content

Added option to read regions from a text file.#840

Merged
daviesrob merged 1 commit into
samtools:developfrom
whitwham:faidx_region_list
May 15, 2018
Merged

Added option to read regions from a text file.#840
daviesrob merged 1 commit into
samtools:developfrom
whitwham:faidx_region_list

Conversation

@whitwham
Copy link
Copy Markdown
Member

@whitwham whitwham commented May 1, 2018

This commit also fixes a bug in writing output to file. The tests have been updated accordingly. Closes #753 and is an alternative solution to the one provided in #754.

Comment thread faidx.c Outdated
" -o, --output FILE Write FASTA to file.\n"
" -n, --length INT Length of FASTA sequence line. [60]\n"
" -c, --continue Continue after trying to retrieve missing region.\n"
" -r, --region_file FILE File of regions. One per line.\n"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be stated that a line in the region file has the format reg:from-to (chr1:12209228-12209246) and it is 1-based, so to be clear that is not a BED file or any other TAB delimited format.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason it couldn't additionally accept the regions as a BED file? This would be useful and in line with other samtools commands (though starting to duplicate the already-existing bedtools getfasta).

While you're editing the usage… wouldn't putting FILE/INT/etc beside their options be more readable?

" -o, --output FILE       Write FASTA to file\n"
…etc…
" -r, --region_file FILE  File of regions.  One per line\n"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, also usual would be --region-file with hyphen rather than underscore.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supporting BED files as well could be useful, although if supported it should have its own option. I see both -b (used by samtools depth) and -L (samtools view) are available. This might count as future work, though. We'd have to look into whether the interface in bedidx.h works in the way we would want it to. This PR is simply to get around the command line limitations on how many regions you can ask for, and works pretty much in the way you would expect for that purpose.

The FILE/INT positioning was presumably so they all lined up in the same column. It probably made more sense when there were only short options. And yes, it should be hyphens not underscores to match the other long options.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regions have their own section in the man page and neither the current faidx usage or man entry defines them. I will add a little bit more to the usage but not a full definition. BED files are a different issue and if anyone wants to raise one then I will look at it. Yes it should be --region-file.

@daviesrob daviesrob force-pushed the faidx_region_list branch from 7ac7ce5 to 466752b Compare May 15, 2018 14:05
This commit also fixes a bug when writing output to file - the
sequence names were written to stdout instead of the output file.
The tests have been updated accordingly.
@daviesrob daviesrob force-pushed the faidx_region_list branch from 466752b to 68b1e87 Compare May 15, 2018 14:11
@daviesrob daviesrob merged commit 68b1e87 into samtools:develop May 15, 2018
@whitwham whitwham mentioned this pull request May 16, 2018
@whitwham whitwham deleted the faidx_region_list branch February 14, 2022 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

faidx - argument list is too long.

4 participants