削除符号論文メモ:2022/12/06

タイトル:A Model for Optimal Assignment of Non-Uniquely Mapped NGS Reads in DNA Regions of Duplications or Deletions
著者名:Rituparna Sinha; Rajat K. Pal; Rajat K. De
雑誌名:2022 2nd International Conference on Intelligent Technologies (CONIT)
号数:-
DOI:10.1109/CONIT55038.2022.9848131

Abstract:Massively parallel sequencers have enabled genome sequences to be available at a very low cost and price, which opened huge scope on analyzing human genome sequences from different perspectives, thereby the association of diseases with genetic alterations gets further enlightened. However, the sequencing process and alignment of NGS technology based short reads suffer from various sequencing biases which needs to be addressed. In this work, the mappability bias occurring with respect to repeat rich regions of the DNA have been addressed in a novel approach. A model has been designed which considers all non-uniquely mapped reads and performs a pipeline of computations to  allocate the reads to an optimal location, due to which the precise detection of 
breakpoints in the region of duplications and deletions are obtained. In addition, the application of this model for mappability bias correction, prior to the detection of structurally altered regions of the genome, leads to a better sensitivity value.
タイトル:Deletions and Insertions of the Symbol “0” and Asymmetric/
Unidirectional Error Control Codes for the L1 Metric
著者名:Luca G. Tallini; Nawaf Alqwaifly; Bella Bose
雑誌名:IEEE Transactions on Information Theory
号数:-
DOI:10.1109/TIT.2022.3203594

Abstract:This paper gives some theory and efficient design of binary 
block codes capable of controlling the deletions of the symbol “0” (
referred to as 0-deletions) and/or the insertions of the symbol “0” (
referred to as 0-insertions). This problem of controlling 0-deletions 
and/or 0-insertions (referred to as 0-errors) is shown to be equivalent 
to the efficient design of L 1 metric asymmetric error control codes 
over the natural alphabet, IIN. In this way, it is shown that the t 0-
insertion correcting codes are actually capable of controlling much more;
 namely, they can correct t 0-errors, detect ( t + 1) 0-errors and, 
simultaneously, detect all occurrences of only 0-deletions or only 0-
insertions in every received word (briefly, they are t -Symmetric 0-
Error Correcting/( t +1)-Symmetric 0-Error Detecting/All Unidirectional 
0-Error Detecting ( t -Sy0EC/( t + 1)-Sy0ED/AU0ED) codes). From the 
relations with the L 1 distance error control codes, new improved bounds 
are given for the optimal t 0-error correcting codes. Optimal non-
systematic code designs are given. Decoding can be efficiently performed 
by algebraic means using the Extended Euclidean Algorithm (EEA).
タイトル:Bounds and Constructions for Insertion and Deletion Codes
著者名:Shu Liu; Chaoping Xing
雑誌名:IEEE Transactions on Information Theory
号数:-
DOI:10.1109/TIT.2022.3199503

Abstract:Insertion and deletion (insdel for short) codes have recently attracted a lot of attention due to their applications in many interesting fields such as DNA storage, DNA analysis, race-track memory error correction and language processing. The present paper mainly studies limits and constructions of insdel codes. The paper can be divided into two parts. The first part focuses on various bounds, while the second part concentrates on constructions of insdel codes. Although the insdel-metric Singleton bound has been derived before, it is still unknown if there are any nontrivial codes achieving this bound. Our first result shows that any nontrivial insdel codes do not achieve the insdel-metric Singleton bound. The second bound shows that every [n, k] Reed-Solomon code has insdel distance upper bounded by 2n - 4k + 4 and it is known in literature that an [n, k] Reed-Solomon code can have insdel distance 2n - 4k+4 as long as the field size is sufficiently large. The third bound shows a trade-off between insdel distance and code alphabet size for codes achieving the Hamming-metric Singleton bound. In the second part of the paper, we first provide a non-explicit construction of nonlinear codes that can approach the insdel-metric Singleton bound arbitrarily when the code alphabet size is sufficiently large. The second construction gives two-dimensional Reed-Solomon codes of length n and insdel distance 2n - 4 with field size q = O(n5). The non-explicit construction of insdel codes is based on constant-weight L1-codes that are introduced in this paper. We first establish a relation between constant-weight L1-codes and insdel codes. Based on this relation, we construct constant-weight L1-codes with reasonable parameters and subsequently give insdel codes approaching the insdel-metric Singleton bound. Via automorphism group of rational function field, we provide a necessary and sufficient condition under which a two
dimensional Reed-Solomon code of length n has insdel distance 2n - 4. 
Based on this criterion, we present a construction of q-ary two-dimensional Reed-Solomon codes of length n and insdel distance 2n - 4 with q = O(n5). Though this is worse than the current best field size, 
we provide a new angle to look into the problem.

 

タイトル:Beyond Single-Deletion Correcting Codes: Substitutions and 
Transpositions
著者名:Ryan Gabrys; Venkatesan Guruswami; João Ribeiro; Ke Wu
雑誌名:IEEE Transactions on Information Theory
号数:-
DOI:10.1109/TIT.2022.3202856

Abstract:We consider the problem of designing low-redundancy codes in settings where one must correct deletions in conjunction with substitutions or adjacent transpositions; a combination of errors that is usually observed in DNA-based data storage. One of the most basic versions of this problem was settled more than 50 years ago by Levenshtein, who proved that binary Varshamov-Tenengolts codes correct one arbitrary edit error, i.e., one deletion or one substitution, with nearly optimal redundancy. However, this approach fails to extend to many simple and natural variations of the binary single-edit error setting. In this work, we make progress on the code design problem above in three such variations: • We construct linear-time encodable and decodable length- n non-binary codes correcting a single edit error with nearly optimal redundancy log n + O (log log n ), providing an alternative simpler proof of a result by Cai, Chee, Gabrys, Kiah, and Nguyen (IEEE Trans. Inf. Theory 2021). This is achieved by employing what we call weighted VT sketches , a new notion that may be of independent interest. • We show the existence of a binary code correcting one deletion or one adjacent transposition with nearly optimal redundancy log n + O (log log n ). • We construct linear-time encodable and list-decodable binary codes with list-size 2 for one deletion and one substitution with redundancy 4 log n + O (log log n ). 
This matches the Gilbert-Varshamov existential bound up to an O (log log n ) additive term.

 


コロラド大学訪問

お金が無い!!!
ドル高 円安のために、金銭的にピンチに陥っています。

アメリカでの研究滞在という貴重な機会。
本来なら、ハワイだけでなく、アメリカ本土を巡って研究討論したり学会参加したりするつもりだったのですが。。。

研究予算の使途が、保険料+αにしかならない状況です。
(ハワイ大学の規程で、保険加入が義務付けられています。
 訪問研究者は、一般より安い学生向けのプランに加入できますが、家族で1か月2000ドル。
 1ドル150円の時は、保険料だけで一か月30万円!!)

落ち込んでいる僕に、コロラド大学の友人から救いの手が。
 「コロラド大で講演する?
  航空券代+宿泊代を出すよ」

なんて優しいオファー!!!
二つ返事で引き受けました。

デンバー国際空港のレストラン Mesa Verde

コロラドと言えば「Mesa Verde」。
世界で最初に世界遺産に選ばれた岩のマンション。

今回利用したデンバー空港と、訪問先のコロラド大学ボールダーキャンパスとからは、Mesa Verdeまでちょっと遠い。
世界遺産観光は断念。

空港のレストランで同名の Mesa Verde を満喫。

 

Mesa Verdeのハンバーガー。肉が美味い。

アメリカと言えば、トルネード(竜巻)。
空港と言えど、自然に対しては安全ではありません。

トイレが竜巻除けのシェルター構造になってます。

トイレの看板に トルネードシェルター の文字が!!

ボールダーはデンバーから北へ少し行ったところ。
空港からバスで1時間くらい。

街並みはレンガ系の建物が多く、
遠くに大きな山がそびえ立ちます。

 

訪問したのは11月1日前後。
秋でした。

ハワイでは、紅葉とか忘れがち。

コロラド大学のキャンパスもレンガ創り。
綺麗です。

学内にプラネタリウムが設置されてます。

立派な大学です。

コロラド大学:数学科の建物
数学科の建物内

友人の Richard Green と再会。

2018年と2020年に会っています。
2020年はコロナが流行する少し前。絶妙なタイミングでした。

 

Richardと私

Richardと僕が知り合ったのは、2002年のこと。
僕と彼の研究テーマが近かったから。

Weyl群のMinuscule表現に関わる組み合わせ構造
というテーマ。

たまたま彼の研究を知って、
Eメールで数回連絡を取り合いました。

 

それから約15年が経ったころ、
Richardが数学書を出版。
タイトルは「Combinatorics of Minuscule Representations」。

早速購入して読んでみると、
僕の名前が書かれていました。

それをきっかけに、再び、連絡を取るようになりました。

 

訪問の機会を与えてくれたRichardに感謝。


結婚おめでとう

萩原研究室を修了した中山君(博士前期課程)とハワイで会いました。

結婚式をハワイで挙げ、その前後の数週間もハワイで過ごすとのこと。

お祝いを兼ねて、中山君と奥様と一緒に食事へ行きました。

 

おめでとう!

末永くお幸せに。