pairtools v1.0.0 roadmap #116

pairtools functionality has changed substantially since the last release (April 2019!).
With the recent dedup and parse updates adding new functionality, it is important to document the changes and release them as a new version.

Note: this is the header post connecting multiple issues; feel free to update and improve it!

PR with updates: https://github.com/open2c/pairtools/pull/117 

#### Post merge:

- [x] sphinx docs update with incorporated walkthroughs

#### Fixes by modules:

pairtools dedup
- [x] finalize detection of optical duplicates https://github.com/open2c/pairtools/issues/106 and https://github.com/open2c/pairtools/issues/59, also related to  https://github.com/open2c/pairtools/issues/54 
- [x] chunked dedup by @Phlya 
- [x] improvement of dedup to include reporting of the parent readID by @Phlya and @agalitsyna

pairtools stats/scaling
- [x] split dedup stats and regular stats
- [x] output chromosome size to the stats output https://github.com/open2c/pairtools/issues/83 
- [x] pairtools stats: YAML output? https://github.com/open2c/pairtools/issues/111  and https://github.com/open2c/pairtools/issues/79
- [x] pairtools scaling tool which takes into account chromosome sizes: https://github.com/open2c/pairtools/issues/81,  https://github.com/open2c/pairtools/issues/56? 

pairtools parse
- [x] parse complex walks engine and tools: https://github.com/open2c/pairtools/pull/109
- [x] stdin and stdout reporting defaults: https://github.com/open2c/pairtools/issues/48 
- [x] flipping issue: https://github.com/open2c/pairtools/issues/91 

pairtools phase
- [x] make it work with both the pip and GitHub versions of bwa: https://github.com/open2c/pairtools/pull/114

pairtools restrict
- [x] Handle empty pairs with "!" chromosomes: https://github.com/open2c/pairtools/issues/76 
- [x] Problem with restriction sites header/first rfrag: https://github.com/open2c/pairtools/issues/73 
- [x] Suggestions by @golobor: https://github.com/open2c/pairtools/issues/16

pairtools merge
- [x] do not require sorting? https://github.com/open2c/pairtools/issues/23 
- [x] headers handling: https://github.com/open2c/pairtools/issues/18

#### General improvements:

Headers maintenance
- [x] allow adding a header to a headerless file https://github.com/open2c/pairtools/issues/119
  or, more broadly, add a header module; draft: https://github.com/open2c/pairtools/pull/121

Code maintenance
- [x] transfer pairlib into sandbox of pairtools lib
- [x] separate cli and lib
- [x] Remove OrderedDict: https://github.com/open2c/pairtools/issues/113 
- [x] Clean up deprecation warnings, e.g. https://github.com/open2c/pairtools/issues/71
- [x] Fix input errors without explanations, e.g. https://github.com/open2c/pairtools/issues/61 

#### Specific proposals: 

Docs improvements
- [x] pairtools walkthrough
- [x] phasing walkthrough
- [x] parse docs update

Tests proposals
- [x] add tests for dedup @Phlya : https://github.com/open2c/pairtools/issues/5
- [x] add tests for stats, and merge: https://github.com/open2c/pairtools/issues/5

Enhancements
- [x] add summaries: https://github.com/open2c/pairtools/pull/105 
- [x] support for [bwa-mem2](https://github.com/bwa-mem2/bwa-mem2), which is 2-3 times faster than the usual bwa mem: https://github.com/open2c/pairtools/discussions/118
- [x] a single I/O utility instead of repeated code in each module

#### Post-release
- let the user define the rule for choosing the "best representative" in each cluster, in particular by MAPQ? https://github.com/open2c/pairtools/pull/95

#### Declined for this release
- bam annotation? https://github.com/open2c/pairtools/issues/67 
- report mapq in the stats: https://github.com/open2c/pairtools/issues/80 (or extend to any specified additional fields?) 
- support Python 3.10: not possible due to conda problem with glibc
- single-cell walkthrough: too detailed
- a more extensive description of the pair type standards, maybe a walkthrough (see question: https://github.com/open2c/pairtools/issues/112, also https://github.com/open2c/pairtools/issues/68, https://github.com/open2c/pairtools/issues/104)
- Add tests for compression-decompression: https://github.com/open2c/pairtools/issues/51 
- Add tests for example_pipeline @golobor : https://github.com/open2c/pairtools/issues/35

#### Resolved with no implementation
- duplicate the data processing history (currently stored in @PG fields) in #command fields of the .pairs header, declined for now: https://github.com/open2c/pairtools/issues/70
- suggestion to set the default temporary folder to ./ instead of $TMPDIR, declined for now: https://github.com/open2c/pairtools/issues/84
- sort is parallel, but someone reported that it is not in their case; no reproducible example: https://github.com/open2c/pairtools/issues/72
- pairtools subsampling already exists; it is not clear what modifications are requested: https://github.com/open2c/pairtools/issues/66
- unified way of changing the separator; it is not clear why it is needed or what the use cases are: https://github.com/open2c/pairtools/issues/50

