Skip to content

Instantly share code, notes, and snippets.

@RP-3
Created April 13, 2020 23:03
Show Gist options
  • Save RP-3/41001739662c1a0383651458221e19f3 to your computer and use it in GitHub Desktop.
Save RP-3/41001739662c1a0383651458221e19f3 to your computer and use it in GitHub Desktop.
/*
IDEA:
- There are four letters, which can be represented
by two bits.
- The sequence we're looking for is 10 chars long,
so can fit inside 20 bits, which in turn fits in
a single int quite comfortably.
APPROACH:
- Traverse the sequence a single time, starting with
the first 10 chars
- On each new char, add the existing sequence to a
set. If the sequence exists already, add it to a
result set.
- Return the result set
*/
/**
* @param {string} s
* @return {string[]}
*/
var findRepeatedDnaSequences = function(s) {
const [seenOnceSet, repeatedSet] = [new Set(), new Set()];
const charMap = {A: 0, C: 1, G: 2, T: 3};
let rollingHash = 0;
for(let i=0; i<s.length; i++){
rollingHash = (rollingHash << 2) // discard first char
+ charMap[s[i]] // add new char
& ((1<<20)-1); // mask excess bits off the left
if(i<9) continue;
if(seenOnceSet.has(rollingHash)) repeatedSet.add(rollingHash);
seenOnceSet.add(rollingHash);
}
const reverseMap = ['A', 'C', 'G', 'T'];
return Array.from(repeatedSet).map((seq) => {
let result = [];
for(let i=0; i<10; i++) result.push(reverseMap[(seq >> (i*2)) & 3]);
return result.reverse().join('');
});
};
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment