Migrating from bash shell scripts to Python

I recently wanted to migrate from a Bash 3 shell script to Python 3. This is nothing but a brain dump comparing a few bits and bobs.

I’m mostly doing what’s expedient, keeping the Bash script and the Python script as similar as possible. I am sure there is a lot of room to improve the Python code, either with the standard library or with third-party modules.

What I show below is by no means the best or even the correct way, as I don’t know Python.

Hash bang

#!/bin/bash
#!/usr/bin/env python3

Console output

  • A multiline echo could also be written as echo -e "Line 1\nLine 2".
  • In the second example, the single quotes around ! avoid history expansion in an interactive shell; alternatively, turn it off with set +o histexpand.
  • Reading environment variables and setting them in the calling shell is possible when sourcing the script, e.g. SET_ENV=1 . script.sh.
  • In Python, " and ' are interchangeable. Use whichever one the string does not contain, e.g. "don't" vs '"do"'.
  • Use Python f-strings, and never go back to print('Hello ' + a + '!'). They have been available since Python 3.6 (December 2016).
  • A KeyError is raised if the environment variable in os.environ[var] is not defined. Use os.environ.get() with a default value instead.
  • Finally, a Python script cannot set an environment variable back in the calling shell.
echo "Line 1
Line 2"

a="world"
echo "Hello $a"'!'

echo $PWD
echo $SET_ENV
export NEW_ENV=0
print("""Line 1
Line 2""")

a='world'
print(f'Hello {a}!')

import os
print(os.environ['PWD'])
print(os.environ.get('SET_ENV', 'default'))
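
To make the last two points concrete, here is a minimal sketch of what Python can and cannot do with the environment. It assumes Python 3.7 or later (for capture_output); SURELY_NOT_DEFINED_VAR is a made-up name used only for illustration. A value assigned to os.environ is seen by the script and by any child process it starts, but never by the shell that launched the script.

```python
import os
import subprocess
import sys

# Changes to os.environ affect this process and the children it starts,
# never the parent shell.
os.environ['NEW_ENV'] = '0'

# Start a child Python process and let it report what it sees.
child = subprocess.run(
    [sys.executable, '-c', 'import os; print(os.environ["NEW_ENV"])'],
    capture_output=True, text=True, check=True,
)
child_value = child.stdout.strip()
print(f'Child sees NEW_ENV={child_value}')

# Hypothetical variable name: .get() returns the default instead of raising KeyError.
missing = os.environ.get('SURELY_NOT_DEFINED_VAR', 'default')
print(f'Missing variable falls back to: {missing}')
```

After the script exits, echo $NEW_ENV in the calling shell still prints nothing.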

Coloured console output

  • Echo to stdout with ANSI escape codes, using echo -e (which is undocumented on macOS) to interpret backslash escapes.
  • Here I am using variables for clarity, but that is not strictly necessary.
  • In Python, you can either use plain variables or keep them together in a class, as shown here.
_cx="\033[0m"
_cr="\033[1;31m"

echo -e "${_cr}RED${_cx}"
class _c:
    x = '\033[0m'
    r = '\033[1;31m'
print(f'{_c.r}RED{_c.x}')
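
One possible refinement, sketched below, is to emit the escape codes only when the output really is a terminal, so that redirected output stays free of control characters. The helper name colour() and the constants RESET and BOLD_RED are my own, not part of any library.

```python
import sys

RESET = '\033[0m'
BOLD_RED = '\033[1;31m'


def colour(text, code, stream=None):
    """Wrap text in an ANSI code, but only if stream is a terminal."""
    stream = sys.stdout if stream is None else stream
    if stream.isatty():
        return f'{code}{text}{RESET}'
    return text  # plain text when piped or redirected


print(colour('RED', BOLD_RED))
```

Running the script with | cat should then print a plain RED without any escape sequences.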

Simple string manipulation

  • Slicing strings is remarkably similar!
  • Python has many string functions which are a lot more usable e.g. find(), index(), join(), replace(), split(), strip(), etc.
a="this is a long string"
echo ${#a}    # 21
echo ${a:8}   # "a long string"
echo ${a:8:4} # "a lo"
echo ${a::4}  # "this"
echo ${a: -6} # "string"
a="this is a long string"
print(len(a))
print(a[8:])
print(a[8:12])
print(a[:4])
print(a[-6:])
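
As a rough illustration of the string functions mentioned above, here are a few of them applied to the same string. This is only a small sample, not a complete tour.

```python
a = 'this is a long string'

position = a.find('long')        # index of the first match, or -1 if absent
print(position)

words = a.split()                # split on whitespace into a list
print(words)

print('-'.join(words))           # glue the list back together with '-'
print(a.replace('long', 'short'))
print('  padded  '.strip())      # remove leading and trailing whitespace
print('long' in a)               # substring test, no function needed
```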

Checking for console redirection

  • Check for redirection (e.g. ./script.sh > x.txt or ./script.sh | cat) using -t.
  • Force output to the terminal by writing to /dev/tty.
  • In Python, output is buffered, so to keep it in order, pass flush=True to print() or call sys.stdout.flush() where needed.
  • To ignore redirection in Python, print to the file named by os.ttyname(0). This relies on stdin (file descriptor 0) still being a terminal.
if [[ -t 1 ]]
then
    echo "Stdout is terminal"
else
    echo "Stdout is redirected"
fi

echo "Always output to terminal (TTY)" >/dev/tty
import os, sys
if sys.stdout.isatty():
    print('Stdout is terminal', flush=True)
else:
    print('Stdout is redirected', flush=True)
# os.ttyname(0) is the terminal device attached to stdin
with open(os.ttyname(0), 'wt') as f:
    print('Always output to terminal (TTY)', file=f)
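
Because os.ttyname(0) raises OSError when stdin is itself redirected (e.g. ./script.py < input.txt), a closer match to the Bash version might be to open /dev/tty directly. The sketch below does that and falls back to stderr when there is no controlling terminal at all, such as under cron. The function name print_to_terminal() is my own invention.

```python
import sys


def print_to_terminal(message):
    """Print to the controlling terminal; fall back to stderr if there is none.

    Returns True if the message went to /dev/tty, False otherwise.
    """
    try:
        with open('/dev/tty', 'w') as tty:
            print(message, file=tty)
        return True
    except OSError:
        # No controlling terminal (or no /dev/tty on this platform).
        print(message, file=sys.stderr)
        return False


reached_terminal = print_to_terminal('Always output to terminal (TTY)')
```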

Cleanly setting line wrap

  • Turn off line wrap, and trap program exit to make sure we turn it back on.
tput rmam
function wrap() {
    tput smam
}
trap wrap EXIT
import os, atexit
os.system('tput rmam')
def wrap():
    os.system('tput smam')
atexit.register(wrap)
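
An alternative that avoids shelling out to tput is to write the escape sequences yourself and restore the setting with a context manager. This is only a sketch: it assumes an xterm-compatible terminal, where tput rmam and tput smam emit ESC[?7l and ESC[?7h respectively, and the name no_line_wrap() is made up.

```python
import contextlib
import sys

NO_WRAP = '\033[?7l'  # assumed equivalent of `tput rmam` on xterm-like terminals
WRAP = '\033[?7h'     # assumed equivalent of `tput smam` on xterm-like terminals


@contextlib.contextmanager
def no_line_wrap(stream=None):
    """Turn line wrap off for the duration of the with-block."""
    stream = sys.stdout if stream is None else stream
    stream.write(NO_WRAP)
    stream.flush()
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, like the Bash EXIT trap.
        stream.write(WRAP)
        stream.flush()


with no_line_wrap():
    print('A very long line that will be cut off, not wrapped. ' * 5)
```

Unlike tput, this does not consult the terminfo database, so the atexit version above is the safer choice on unusual terminals.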

Command line arguments

  • After shift-ing, only the remaining arguments are left behind, so use $1, $2, $3, etc. to get the first, second, third of them.
  • In Python, getopt raises an error (getopt.GetoptError) when the arguments are incorrect.
  • It returns the remaining arguments as a list in arguments (arguments[0], arguments[1], ...).
while getopts "hi:" args; do
 case $args in
  h) 
    echo "Help..." 
    exit
    ;;
  i) echo "Hi $OPTARG!" ;;
  *) echo "Error in arguments" && exit 1 ;;
 esac
done

shift $((OPTIND-1))
echo "Remainder: $*"
import getopt, sys
try:
    options, arguments = getopt.getopt(sys.argv[1:], 'hi:')
except getopt.GetoptError as e:
    print(f'Error in arguments: {e.msg}')
    sys.exit(1)
for opt, value in options:
    if opt == '-h':
        print('Help...')
        sys.exit()
    if opt == '-i':
        print(f'Hi {value}!')
print(f'Remainder: {arguments}')
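
The standard library also has argparse, which generates the -h help text and the error messages by itself. The sketch below is one way the same options might look. The names build_parser, NAME and remainder are my own choices, and the argument list is hard-coded so the example runs anywhere; a real script would call parse_args() without arguments to read sys.argv.

```python
import argparse


def build_parser():
    """Build a parser for -h, -i NAME and any remaining arguments."""
    parser = argparse.ArgumentParser(description='Greets someone.')
    parser.add_argument('-i', metavar='NAME', help='name to greet')
    parser.add_argument('remainder', nargs='*', help='remaining arguments')
    return parser


# Hard-coded arguments for illustration only.
args = build_parser().parse_args(['-i', 'world', 'one', 'two'])
if args.i:
    print(f'Hi {args.i}!')
print(f'Remainder: {args.remainder}')
```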

Single line if-then-else

  • I realize that using && and || on return codes is not the same thing as a ternary operator (?:). If the command after && fails, the one after || runs as well.
  • In shell, one has to know exactly what one is trying to do and then choose the best method; see this StackOverflow answer.
a=1
b=2
[[ $a == "$b" ]] && c="yes" || c="no"
a = 1
b = 2
c = "yes" if a == b else "no"
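
Python has the same trap if you chain and/or instead of using a conditional expression. A small sketch, with values chosen to trigger the problem:

```python
a = 1
b = 1

# and/or chaining: '' is falsy, so the result falls through to 'fallback'
# even though a == b is True.
chained = a == b and '' or 'fallback'

# Conditional expression: the result depends on the condition only.
ternary = '' if a == b else 'fallback'

print(repr(chained))
print(repr(ternary))
```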

Filtering text files

  • Check whether there are any text files (via the ls return code).
  • If there are, take the most recent one (the first line of the ls -t output).
  • Find matching lines with grep.
  • Read each line as input and split it into an array at the delimiter =.
  • In Python, sorted(list) returns a new list, while list.sort() sorts in place.
  • When reading from a file, Python keeps the trailing newline on each line, hence the rstrip('\n').
ls -1t /temp/*.txt 2>/dev/null 
if [[ $? -eq 0 ]]; then
    file="$(ls -1t /temp/*.txt | head -n1)"
    pattern='.*=.*'
    while read -r line
    do
        IFS='=' read -ra part <<< "$line"
        echo "Key [${part[0]}] has value [${part[1]}]"
    done < <(grep -i "$pattern" "$file")
fi
import re, os
from pathlib import Path
# Newest first, like ls -t
files = sorted(Path('/temp').glob('*.txt'), key=os.path.getmtime, reverse=True)
if files:
    # Keep the compiled pattern; a bare re.compile() call discards the flag
    pattern = re.compile('.*=.*', re.IGNORECASE)
    with open(files[0]) as f:
        for line in f:
            if pattern.search(line):
                part = line.rstrip('\n').split('=')
                print(f'Key [{part[0]}] has value [{part[1]}]')
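
Since the pattern only checks for an = sign, the regular expression can probably be dropped altogether. The sketch below uses str.partition(), which splits at the first = only, so a value that itself contains = stays in one piece. The function names newest_file() and key_values() are hypothetical.

```python
from pathlib import Path


def newest_file(directory, pattern='*.txt'):
    """Return the most recently modified match, or None if there is none."""
    files = Path(directory).glob(pattern)
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def key_values(path):
    """Yield (key, value) for every line of the file that contains '='."""
    with open(path) as f:
        for line in f:
            key, sep, value = line.rstrip('\n').partition('=')
            if sep:  # sep is '' when the line has no '='
                yield key, value


latest = newest_file('/temp')
if latest is not None:
    for key, value in key_values(latest):
        print(f'Key [{key}] has value [{value}]')
```

Using max() instead of sorting avoids ordering the whole list when only the newest file is needed.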