Fusion Junctions Format

1. chrom: The names of the two chromosomes involved in the junction, joined by '~' (e.g. chr20~chr17)

2. donerEnd: The end of the donor site of splicing on the chromosome

3. acceptorStart: The start of the acceptor site of splicing on the chromosome

4. name: The name of the junction

5. coverage: Number of reads aligned to the junction

6. strand: The strands of the reads mapped to the two chromosomes (e.g. ++)

7. itemRgb: An RGB value of the form R,G,B

8. blockCount: The number of blocks (exons) in the BED line

9. blockSizes - A comma-separated list of the block sizes.

10. blockStarts - A comma-separated list of block starts.

11. entropy - Entropy of the junction.

12. flank string case - Non-zero for canonical and semi-canonical junctions, coded as follows:

ATAC 1
GTAT 2
CTGC 3
GCAG 4
GTAG 5
CTAC 6
others 0

13. flank string - The two base pairs after the donor site combined with the two base pairs before the acceptor site
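As a rough illustration of how fields 12 and 13 relate, the sketch below joins two dinucleotides into a flank string and looks up its case code in the table under field 12. The function names `flank_string` and `flank_case` are hypothetical and are not part of the tool that writes this format.

```python
# Codes copied from the table under field 12; any other flank string maps to 0.
FLANK_CASE = {"ATAC": 1, "GTAT": 2, "CTGC": 3, "GCAG": 4, "GTAG": 5, "CTAC": 6}


def flank_string(after_donor: str, before_acceptor: str) -> str:
    """Join the 2 bases after the donor site with the 2 bases before the acceptor site."""
    if len(after_donor) != 2 or len(before_acceptor) != 2:
        raise ValueError("expected two dinucleotides")
    return (after_donor + before_acceptor).upper()


def flank_case(flank: str) -> int:
    """Return the flank string case code, or 0 for non-canonical flank strings."""
    return FLANK_CASE.get(flank.upper(), 0)
```

For instance, `flank_case(flank_string("GT", "AG"))` looks up GTAG in the table, and any flank string not listed falls back to 0.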

14. min mismatch - Minimum number of mismatches among reads mapped to the junction

15. max mismatch - Maximum number of mismatches among reads mapped to the junction

16. average mismatch - Average number of mismatches over all reads mapped to the junction

17. maximal of minimal donor site length: For each read, if the donor site is shorter than the acceptor site, and the donor site is longer than the current maximal donor site length, then update the current maximal donor site length

18. maximal of minimal acceptor site length: For each read, if the acceptor site is shorter than the donor site, and the acceptor site is longer than the current maximal acceptor site length, then update the current maximal acceptor site length

19. minimal anchor difference - Minimal difference between the donor site length and the acceptor site length
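The update rules for fields 17 to 19 can be sketched as a single pass over the reads. This is only an illustration under stated assumptions: field 18 is read symmetrically to field 17, each read is represented by a hypothetical (donor_len, acceptor_len) pair, and "shorter" is taken as a strict comparison. How the actual tool treats reads whose two sides are equally long is not stated here.

```python
def anchor_stats(anchors):
    """Compute fields 17-19 from (donor_len, acceptor_len) pairs, one per read.

    The input representation is an assumption made for this sketch.
    """
    max_min_donor = 0      # field 17
    max_min_acceptor = 0   # field 18
    min_diff = None        # field 19; stays None if there are no reads
    for donor_len, acceptor_len in anchors:
        # Donor side is the shorter anchor: track the longest such donor side.
        if donor_len < acceptor_len and donor_len > max_min_donor:
            max_min_donor = donor_len
        # Acceptor side is the shorter anchor: track the longest such acceptor side.
        if acceptor_len < donor_len and acceptor_len > max_min_acceptor:
            max_min_acceptor = acceptor_len
        diff = abs(donor_len - acceptor_len)
        if min_diff is None or diff < min_diff:
            min_diff = diff
    return max_min_donor, max_min_acceptor, min_diff
```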

20. unique read count - Number of uniquely mapped reads mapped to the fusion

21. multiple read count - Number of multiply mapped reads mapped to the fusion

22. paired reads count - Number of reads mapped to the fusion that can be paired with their mates near the fusion

23. left paired reads count - Number of paired reads that are mapped to the left of their mates on the genome

24. right paired reads count - Number of paired reads that are mapped to the right of their mates on the genome

25. multiple paired reads count - Number of multiply mapped reads mapped to the fusion and are paired with their mates

26. unique paired reads count - Number of uniquely mapped reads mapped to the fusion and are paired with their mates

27. single reads count - Number of reads mapped to the fusion that cannot be paired with their mates

28. encompassing read pairs count - Number of read pairs that surround the fusion (but do not cross it)

29. donerStart: The start of the donor site of splicing on the chromosome

30. acceptorEnd: The end of the acceptor site of splicing on the chromosome

31. donor isoform structures: The isoform (transcript) structures on the donor site. Isoform structures are separated by '|'. Each isoform structure has the format "start_of_the_isoform,CIGAR_string_of_structure".

E.g. 59445681,180M12006N66M8046N47M|59445681,180M20118N47M|

Two isoforms start at 59445681
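A minimal sketch of reading an isoform structures field is shown below. It assumes the CIGAR strings contain only M (matched block) and N (skipped region) operations, as in the example above; any other operation would be silently ignored by this regular expression. The helper names are hypothetical, and treating the sum of the M operations as the isoform length is an assumption.

```python
import re

# Matches one CIGAR operation such as "180M" or "12006N" (M and N only, by assumption).
CIGAR_OP = re.compile(r"(\d+)([MN])")


def parse_isoforms(field):
    """Split an isoform structures field into (start, [(length, op), ...]) tuples."""
    isoforms = []
    for item in field.split("|"):
        if not item:
            continue  # the trailing '|' leaves an empty piece
        start_text, cigar = item.split(",", 1)
        ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
        isoforms.append((int(start_text), ops))
    return isoforms


def matched_length(ops):
    """Sum of the M operations; assumed here to be the isoform length."""
    return sum(n for n, op in ops if op == "M")
```

Applied to the example above, `parse_isoforms` yields two isoforms that both start at 59445681, with matched lengths 180+66+47 and 180+47.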

32. acceptor isoform structures: The isoform (transcript) structures on the acceptor site, in the same format as the donor isoform structures.

33. donor uniformity score (disabled) - The p-value of T-test against the hypothesis that the start of spanning read pairs and encompassing read pairs distribute uniformly on the donor site

34. acceptor uniformity score (disabled) - The p-value of Kolmogorov-Smirnov test against the hypothesis that the end of spanning read pairs and encompassing read pairs distribute uniformly on the acceptor site

35. donor uniformity KS-test score (disabled) - The score of Kolmogorov-Smirnov test against the hypothesis that the start of spanning read pairs and encompassing read pairs distribute uniformly on the donor site

36. acceptor uniformity KS-test score (disabled) - The score of Kolmogorov-Smirnov test against the hypothesis that the end of spanning read pairs and encompassing read pairs distribute uniformly on the acceptor site

37. minimal donor isoform length - Minimal length of isoform structure on the donor site

38. maximal donor isoform length - Maximal length of isoform structure on the donor site

39. minimal acceptor isoform length - Minimal length of isoform structure on the acceptor site

40. maximal acceptor isoform length - Maximal length of isoform structure on the acceptor site

41. donor site match to normal junction - Whether the fusion donor site matches a normal splice junction. 1 is matched, 0 is not matched

42. acceptor site match to normal junction - Whether the fusion acceptor site matches a normal splice junction. 1 is matched, 0 is not matched

43. donor sequence - The 25 bp sequence at the donor site matched to the fusion reads

if donor strand is +, it is chrom1[donerEnd-25:donerEnd]

if donor strand is -, it is revcomp(chrom1[donerEnd:donerEnd+25])

44. acceptor sequence - The 25 bp sequence at the acceptor site matched to the fusion reads

if acceptor strand is +, it is chrom2[acceptorStart:acceptorStart+25]

if acceptor strand is -, it is revcomp(chrom2[acceptorStart-25:acceptorStart])
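The four slicing rules for fields 43 and 44 can be sketched as follows. The slice notation is read here as Python-style 0-based, half-open indexing into an in-memory chromosome string; the coordinate convention is not stated in this document, so treat that as an assumption. The function names and the `width` parameter are hypothetical.

```python
# Translation table for complementing bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def donor_sequence(chrom1, doner_end, strand, width=25):
    """Field 43: sequence at the donor site, oriented by the donor strand."""
    if strand == "+":
        return chrom1[doner_end - width:doner_end]
    if strand == "-":
        return revcomp(chrom1[doner_end:doner_end + width])
    raise ValueError("strand must be '+' or '-'")


def acceptor_sequence(chrom2, acceptor_start, strand, width=25):
    """Field 44: sequence at the acceptor site, oriented by the acceptor strand."""
    if strand == "+":
        return chrom2[acceptor_start:acceptor_start + width]
    if strand == "-":
        return revcomp(chrom2[acceptor_start - width:acceptor_start])
    raise ValueError("strand must be '+' or '-'")
```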

45. match to gene strand - Whether the fusion strand matches the annotated gene strand. 1 is matched, 0 is not matched

46. fusion source - The source of the fusion

from_fusion: The fusion is from fusion alignments

from_normal: The fusion is from normal alignments, i.e. a normal junction that crosses two genes (read-through fusion)

47. fusion type - The type of fusion based on the annotated genes

fusion: The start and end of the fusion are annotated to two distinct genes

normal: The start and end of the fusion are annotated to the same gene (circular RNAs)

intergenic: Either the start or the end has no annotated gene

overlapped?

48. gene strand - The strands of the annotated genes, if any

49. donor annotated gene - The name of the gene annotated to the donor site of the fusion

50. acceptor annotated gene - The name of the gene annotated to the acceptor site of the fusion


Example:

chr20~chr17 49411710 59445688 JUNC_1215 354 ++ 255,0,0 2 94,94,201,174, 0,18446744073699517733, 3.886652 6 GTAG 0 4 1.596045 50 50 0 354 0 208 156 52 0 208 146 282 49411616 59445782 59445681,180M12006N66M8046N47M|59445681,180M20118N47M|59445681,180M23477N74M|59445681,180M24387N8M| 49411511,205M| 0 0 0.304892 0.182286 293 188 205 205 doner_exact_matched acceptor_exact_matched CCTGACCCCCGAGCCTGGGGCCGAG AGGGTCACGCTCCTGTCAAAGGTAC 1 from_fusion fusion +,+ BCAS4, BCAS3,
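To tie the 50 field descriptions to the example line, a line can be split on whitespace and zipped with a list of field names. This is a sketch only: the keys other than the first ten plus donerStart and acceptorEnd are hypothetical labels chosen for this illustration, and splitting on whitespace assumes that no field is empty and that no field contains internal whitespace, which holds for the example line but is not guaranteed by this document.

```python
FIELD_NAMES = [
    "chrom", "donerEnd", "acceptorStart", "name", "coverage",
    "strand", "itemRgb", "blockCount", "blockSizes", "blockStarts",
    "entropy", "flank_case", "flank_string", "min_mismatch", "max_mismatch",
    "avg_mismatch", "max_min_donor_len", "max_min_acceptor_len",
    "min_anchor_diff", "unique_reads", "multiple_reads", "paired_reads",
    "left_paired_reads", "right_paired_reads", "multiple_paired_reads",
    "unique_paired_reads", "single_reads", "encompassing_pairs",
    "donerStart", "acceptorEnd", "donor_isoforms", "acceptor_isoforms",
    "donor_uniformity", "acceptor_uniformity", "donor_ks_score",
    "acceptor_ks_score", "min_donor_isoform_len", "max_donor_isoform_len",
    "min_acceptor_isoform_len", "max_acceptor_isoform_len",
    "donor_match_normal", "acceptor_match_normal", "donor_seq",
    "acceptor_seq", "match_gene_strand", "fusion_source", "fusion_type",
    "gene_strand", "donor_gene", "acceptor_gene",
]


def parse_fusion_line(line):
    """Return a dict mapping the 50 field names to the raw string values."""
    values = line.split()  # assumption: whitespace-separated, no empty fields
    if len(values) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, got {len(values)}"
        )
    return dict(zip(FIELD_NAMES, values))
```

All values are kept as strings, so numeric fields need an explicit conversion, for example `int(record["coverage"])`.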