author | Luke Shumaker <LukeShu@sbcglobal.net> | 2013-10-12 13:47:42 -0400 |
---|---|---|
committer | Luke Shumaker <LukeShu@sbcglobal.net> | 2013-10-12 13:47:42 -0400 |
commit | 6a42c8de66e3b2dc7293ddeadaa3ee396db2624d (patch) | |
tree | 67a027b892d3122662526504dd6d11e8dea02ca1 /public/bash-arrays.md |
initial commit
Diffstat (limited to 'public/bash-arrays.md')
-rw-r--r-- | public/bash-arrays.md | 201 |
1 files changed, 201 insertions, 0 deletions
diff --git a/public/bash-arrays.md b/public/bash-arrays.md
new file mode 100644
index 0000000..697d018
--- /dev/null
+++ b/public/bash-arrays.md
@@ -0,0 +1,201 @@
+Bash arrays
+===========
+:copyright 2013 Luke Shumaker
+
+Way too many people don't understand Bash arrays. Many of them argue
+that if you need arrays, you shouldn't be using Bash. If we reject
+the notion that one should never use Bash for scripting, then thinking
+you don't need Bash arrays is what I like to call "wrong".
+
+The simple explanation of why everybody who programs in Bash needs to
+understand arrays is this: command line arguments are exposed as an
+array. Does your script take any arguments on the command line?
+Great, you need to work with an array!
+
+General array syntax
+--------------------
+
+The most important things in understanding arrays are quoting them
+and understanding the difference between `@` and `*`.
+
+<table>
+  <caption>
+    <h1>Getting the entire array</h1>
+    <p>There is <em>no</em> valid reason to not wrap these in double
+    quotes.</p>
+  </caption>
+  <tbody>
+    <tr>
+      <td><code>"${array[@]}"</code></td>
+      <td>Returns every element of the array as a separate token.</td>
+    </tr><tr>
+      <td><code>"${array[*]}"</code></td>
+      <td>Returns every element of the array in a single
+      whitespace-separated string.</td>
+    </tr>
+  </tbody>
+</table>
+
+It's really that simple; that covers most usages of arrays, and most of
+the mistakes made with them.
+
+To help you understand the difference between `@` and `*`, here is a
+sample of each.
+
+<pre><code>#!/bin/bash
+array=(foo bar baz)
+for item in "${array[@]}"; do
+  echo " - <${item}>"
+done<hr> - <foo>
+ - <bar>
+ - <baz></code></pre>
+
+<pre><code>#!/bin/bash
+array=(foo bar baz)
+for item in "${array[*]}"; do
+  echo " - <${item}>"
+done<hr> - <foo bar baz></code></pre>
+
+To get individual entries, the syntax is
+<code>${array[<var>n</var>]}</code>, where <var>n</var> starts at 0.
+
+<table>
+  <caption>
+    <h1>Getting a single entry from the array</h1>
+  </caption>
+  <tbody>
+    <tr>
+      <td><code>"${array[<var>n</var>]}"</code></td>
+      <td>Returns the <var>n</var>th entry of the array, where the
+      first entry is at <var>n</var>=0.</td>
+    </tr>
+  </tbody>
+</table>
+
+To get a subset of the array, there are a few options (as usual,
+switch between `@` and `*` to choose between
+getting it as separate items or as a whitespace-separated string):
+
+<table>
+  <caption>
+    <h1>Getting subsets of an array</h1>
+    <p>Substitute <code>*</code> for <code>@</code> to get the subset
+    as a whitespace-separated string instead of separate tokens, as
+    described above.</p>
+    <p>Again, there is no valid reason to not wrap each of these in
+    double quotes.</p>
+  </caption>
+  <tbody>
+    <tr>
+      <td><code>"${array[@]:<var>start</var>}"</code></td>
+      <td>Returns from <var>n</var>=<var>start</var> to the end of the array.</td>
+    </tr><tr>
+      <td><code>"${array[@]:<var>start</var>:<var>count</var>}"</code></td>
+      <td>Returns <var>count</var> entries, starting at <var>n</var>=<var>start</var>.</td>
+    </tr><tr>
+      <td><code>"${array[@]::<var>count</var>}"</code></td>
+      <td>Returns <var>count</var> entries from the beginning of the array.</td>
+    </tr>
+  </tbody>
+</table>
+
+Notice that `"${array[@]}"` is equivalent to `"${array[@]:0}"`.
+
+<table>
+  <caption>
+    <h1>Getting the length of an array</h1>
+    <p>This is the only situation where there is no difference
+    between <code>@</code> and <code>*</code>.</p>
+  </caption>
+  <tbody>
+    <tr>
+      <td>
+        <code>${#array[@]}</code>
+        <br>or<br>
+        <code>${#array[*]}</code>
+      </td>
+      <td>
+        Returns the length of the array
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+Accessing the arguments array
+-----------------------------
+
+Accessing the arguments is mostly just as simple, but that array doesn't
+actually have a variable name. It's special. Instead, it is exposed
+through a series of special variables (normal variables can only start
+with letters and underscore), that *mostly* match up with the normal
+array syntax.
+
+<table>
+  <caption>
+    <h1>Accessing the arguments array</h1>
+    <aside>Note that for values of <var>n</var> with more than 1
+    digit, you need to wrap it in <code>{}</code>.
+    Otherwise, <code>"$10"</code> would be parsed
+    as <code>"${1}0"</code>.</aside>
+  </caption>
+  <tbody>
+    <tr><th colspan=2>Individual entries</th></tr>
+    <tr><td><code>${array[0]}</code></td><td><code>$0</code></td></tr>
+    <tr><td><code>${array[1]}</code></td><td><code>$1</code></td></tr>
+    <tr><td colspan=2 style="text-align:center">...</td></tr>
+    <tr><td><code>${array[9]}</code></td><td><code>$9</code></td></tr>
+    <tr><td><code>${array[10]}</code></td><td><code>${10}</code></td></tr>
+    <tr><td colspan=2 style="text-align:center">...</td></tr>
+    <tr><td><code>${array[<var>n</var>]}</code></td><td><code>${<var>n</var>}</code></td></tr>
+    <tr><th colspan=2>Subset arrays (array)</th></tr>
+    <tr><td><code>"${array[@]}"</code></td><td><code>"${@:0}"</code></td></tr>
+    <tr><td><code>"${array[@]:1}"</code></td><td><code>"$@"</code></td></tr>
+    <tr><td><code>"${array[@]:<var>pos</var>}"</code></td><td><code>"${@:<var>pos</var>}"</code></td></tr>
+    <tr><td><code>"${array[@]:<var>pos</var>:<var>len</var>}"</code></td><td><code>"${@:<var>pos</var>:<var>len</var>}"</code></td></tr>
+    <tr><td><code>"${array[@]::<var>len</var>}"</code></td><td><code>"${@::<var>len</var>}"</code></td></tr>
+    <tr><th colspan=2>Subset arrays (string)</th></tr>
+    <tr><td><code>"${array[*]}"</code></td><td><code>"${*:0}"</code></td></tr>
+    <tr><td><code>"${array[*]:1}"</code></td><td><code>"$*"</code></td></tr>
+    <tr><td><code>"${array[*]:<var>pos</var>}"</code></td><td><code>"${*:<var>pos</var>}"</code></td></tr>
+    <tr><td><code>"${array[*]:<var>pos</var>:<var>len</var>}"</code></td><td><code>"${*:<var>pos</var>:<var>len</var>}"</code></td></tr>
+    <tr><td><code>"${array[*]::<var>len</var>}"</code></td><td><code>"${*::<var>len</var>}"</code></td></tr>
+    <tr><th colspan=2>Array length</th></tr>
+    <tr><td><code>${#array[@]}</code></td><td><code>$#</code> + 1</td></tr>
+  </tbody>
+</table>
+
+Did you notice what was inconsistent? The variables `$*`, `$@`, and `$#`
+behave like the <var>n</var>=0 entry doesn't exist.
+
+<table>
+  <caption>
+    <h1>Inconsistencies</h1>
+  </caption>
+  <tbody>
+    <tr>
+      <th colspan=3><code>@</code> or <code>*</code></th>
+    </tr><tr>
+      <td><code>"${array[@]}"</code></td>
+      <td>→</td>
+      <td><code>"${array[@]:0}"</code></td>
+    </tr><tr>
+      <td><code>"${@}"</code></td>
+      <td>→</td>
+      <td><code>"${@:1}"</code></td>
+    </tr><tr>
+      <th colspan=3><code>#</code></th>
+    </tr><tr>
+      <td><code>"${#array[@]}"</code></td>
+      <td>→</td>
+      <td>length</td>
+    </tr><tr>
+      <td><code>"${#}"</code></td>
+      <td>→</td>
+      <td>length-1</td>
+    </tr>
+  </tbody>
+</table>
+
+These make sense because argument 0 is the name of the script; we
+almost never want that when parsing arguments. If they included it, you'd
+spend more code getting the values that they currently give you.
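
The rules in the article added by this commit can be checked with a short script. This is only a minimal sketch and not part of the commit: the helper `count_args` and the sample values are hypothetical names chosen for illustration. It counts how many separate words each expansion produces, for a named array and for the arguments array.

```shell
#!/bin/bash
# Illustrative sketch only; count_args and the sample data are hypothetical.

# Print how many separate words the caller passed.
count_args() {
  echo "$#"
}

array=(foo "bar baz" qux)

# "@" keeps each element as its own word, even one containing a space.
count_args "${array[@]}"       # prints 3
# "*" joins all elements into a single word.
count_args "${array[*]}"       # prints 1
# A slice: two entries starting at n=1.
count_args "${array[@]:1:2}"   # prints 2

# Replace the script's arguments; $0 is not touched and not counted by $#.
set -- one "two three" four
count_args "$@"                # prints 3
count_args "$*"                # prints 1
count_args "${@:2}"            # prints 2
```

Run it with `bash`, since plain `sh` may not support arrays. Putting an element that contains a space into the sample data is deliberate: it is the case where leaving off the double quotes would change the counts.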