JavaScript's immense flexibility affords the creation of arrays nested arbitrarily. Each element in an array can be of any type and of varying depths. A function that completely flatten various nested arrays can be a useful tool.
There are two general approaches to creating this flattening function, recursion and iteration, and many variations therein. As we'll see, the most elegant looking solutions to this kind of problem tend to be recursive. Alas, recursion in JavaScript is not well optimized. On the other hand, iterative solutions in JavaScript can be more complex and require extra care with regards to mutations.
Recursion is particularly good at expressing solutions for deeply nested data structures. Consider the following example:
const flattenReducer = ( acc, item ) => {
return acc.concat(
!Array.isArray( item )
? item
: flattenRecursive( item )
)
}
const flattenRecursive = ( arr ) => arr.reduce( flattenReducer , [] )
It is relatively easy to maintain and understand. It also doesn't mutate the original data. However, the JavaScript runtime does need to maintain a call stack as deep as our arrays. While we're unlikely to have an array so deep as to halt execution, we can eliminate the need to maintain a call stack by utilizing a trampoline. A trampoline is a technique to flatten the call stack. Here is how a simple trampoline would look:
const trampoline = ( fn ) => ( ...args ) => {
let result = fn( ...args )
// `fn` will return a thunk until it is complete
while (typeof result === 'function') {
result = result()
}
return result
}
Our recursive function will need to be refactored. While we're at it, let's add some state to manage the maximum depth our function will flatten. To do this, we'll move the recursive call to the end of our function, then place it in thunk to delay execution on the trampoline.
const flattenTrampoline = trampoline(
function( arr , maxDepth = Infinity, depth = 0, flatArr = [] ) {
// if there are no more item in the array, we are done
if( !arr.length ) {
return flatArr
}
if( maxDepth === 0 ) {
return arr
}
const [ headItems, ...tailItems ] = arr
const headIsArray = Array.isArray( headItems )
const shouldGoDeeper = headIsArray && depth < maxDepth
// If the head items is an array
// - iterate with only the head
// flattened one level
// Otherwise
// - add the head item to our
// flattened array and continue
// with with the rest of the array
const nextFlat = shouldGoDeeper
? flatArr
: flatArr.concat( [ headItems ] )
const nextArr = shouldGoDeeper
? headItems.concat(tailItems)
: tailItems
const nextDepth = shouldGoDeeper
? depth + 1
: 0
return () => flattenTrampoline( nextArr, maxDepth, nextDepth, nextFlat )
}
)
The benefit to this approach is that the depth of the array will not increase our call stack. The trampoline does allow us to think recursively; however, our re-arrangement is certainly less elegant than our original solution. It is also not very performant--given the extra steps involved, it is also slower than our first mentioned recursive solution.
The silver-lining is that by re-writing our recursive problem this way, the iterative solution becomes easier to see.
We observe that for every call to our function (each iteration):
- We test to see if the head item is an array:
- If it is, we concatenate it back into the array. This reduces one array container. By definition, a
concat
operation produces a single new object from two. - If it is not, we append this item onto the flattened array we are constructing.
- If it is, we concatenate it back into the array. This reduces one array container. By definition, a
- We check to see if the original array has any items in it.
- If it does, we continue onto the next function call (iteration).
- Otherwise, we return the newly constructed flattened array.
The trampoline version of our recursive function clued us in on how to solve this iteratively. The procedure to convert is straight forward--we can identify the state we need to maintain throughout the iteration by looking at the parameters passed along each iteration. We declare those states above our loop and on each iteration. Since this will be the solution we'll move forward with, I'll add some additional runtime checks to cover in our tests. Here is our unoptimized iterative solution:
function flattenIterativelyUnoptimized( arr, maxDepth = Infinity ) {
// if there are no more item in the array, we are done
if (!arr.length) {
return []
}
if (maxDepth === 0) {
return arr
}
// state
let shallowCopyArr = [...arr]
let flatArr = []
let depth = 0
while (shallowCopyArr.length) {
let [headItems, ...tailItems] = shallowCopyArr
let headIsArray = Array.isArray(headItems)
let shouldGoDeeper = headIsArray && depth < maxDepth
// update state
flatArr = shouldGoDeeper ? flatArr : flatArr.concat([headItems])
shallowCopyArr = shouldGoDeeper ? headItems.concat(tailItems) : tailItems
depth = shouldGoDeeper ? depth + 1 : 0
}
// done
return flatArr
}
The only gotcha here is making sure we don't mutate the original array. If performance is of the utmost concern, say, because we will be calling this function hundreds of thousands of times for a long list of arrays, then the shallow copy can be done before the function call. Otherwise, it is prudent to handle this operation inside of the our function to save our users any surprises.
We started our iterative version by mimicking the trampoline-based recursive version. We can improve the performance of the iterative version by embracing more mutations within the scope of our function. The while
could be re-written as follows:
while( shallowCopyArr.length ){
let headItems = shallowCopyArr.shift()
if( Array.isArray( headItems ) && depth + 1 < maxDepth) {
// flatArr unchanged
shallowCopyArr = headItems.concat( shallowCopyArr ) // flatten once
depth++
} else {
// shallowCopyArr unchanged
flatArr.push( headItems ) // mutate, saves copying
depth = 0
}
}
This version is four times faster than the trampoline version it was based on. Here is a look at the relative benchmark performance:
![image-20190415165326266](/Users/babak/Library/Application Support/typora-user-images/image-20190415165326266.png)
Our test should resemble the way our function can be used. We should also include some common scenarios that will lead to an error. There are many ways that a parameter passed into our function can be wrong.
if (!Array.isArray(arr)) {
throw new Error( "The array to flatten must be an array" )
}
if (maxDepth < 0) {
throw new Error( "Maximum depth must be a non-negative whole number" )
}
if (!Number.isInteger(maxDepth) && maxDepth !== Infinity ) {
throw new Error( "Maximum depth must be a non-negative whole number" )
}
Static type checking is a great way to catch costly runtime errors at compile time. That said, strictly typing such a versatile function as our flatten
function is complex in any language. For example, if we want to express a type for a function that takes a homogeneous array and flattens it once, we could write the following type definitions (TypeScript):
declare function flattenOnce<A>( arr: Array< A | A[] > ): A[]
Or, in our case
function flattenOnce<A>( arr: Array< A | A[] > ): A[] {
return flatten( arr , 1 )
}
But it is challenging, if at all currently possible, to express a function that can take any kind of data, with arrays nested to various degrees which will then return a flattened structure to any desired degree. That said, we can still construct a type that will enable the user to utilize static checking for the expected input.
declare function flatten<ExpectedInput extends unknown[], KnownOutput extends unknown[] = unknown[]>(arr: ExpectedInput, maxDepth?: number): KnownOutput
JavaScript is a flexible loosely typed programming language. It has a rich community and supports many styles of programming. When it comes to performance optimization, we can utilize it's procedural features to avoid duplication of lists in memory when it is unnecessary. By localizing our mutations, we can keep our functions pure, and ensuring we avoid mutations and undesired side-effects throughout the rest of our app.
The above code doesn't work, it will flatten the array totally regardless of the maxDepth. Because we are setting depth to be zero whenever it is not an array.