Bash Strings

2014-09-01

My bashlib project was supposed to be pretty much complete, but I've come back to it recently.

I've now implemented a few nice PHP string functions in bash that work almost exactly the same as their PHP counterparts. I didn't look at how the PHP functions were written and instead just went off the relevant function documentation pages.

The first function I thought would be nice was substr; this is more flexible than the standard bash substring notation. For example:

START=6
END=8
STRING="this is a string!"
STRING=${STRING:${START}:${END}}
echo "${STRING}"

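This would output "s a stri" (note that, despite its name, END is a length here, not an end position). Maybe that's what we are after, but maybe not. That's about as far as direct substring expansion goes in older versions of bash (4.2 and later also accept a negative length); anything else has to be calculated beforehand. One of the primary benefits of how PHP handles substrings is that we can work backwards from the end. For example: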

$STRING = substr ( "this is a string!", 6, -3 );
echo $STRING;

This will now give us the same result as our bash substring: "s a stri". Except this time we didn't need to know how long that substring was going to be; we just knew that, relative to the main string, 3 characters from the end is where we want to finish. Of course, we can continue to use 6 and 8 as the values in PHP and we'd get the same result, but by working backwards from the end of the main string we open up a lot more possibilities.
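For comparison, here is roughly what "calculated beforehand" looks like with nothing but plain substring expansion. This is only a sketch: the variable names TRIM, LENGTH and RESULT are made up for the illustration and are not part of bashlib.

```bash
#!/usr/bin/env bash
STRING="this is a string!"
START=6
TRIM=3    # hypothetical: how many characters to leave off the end

# Work out the length by hand: total length, minus the offset, minus the tail.
LENGTH=$(( ${#STRING} - START - TRIM ))

RESULT="${STRING:${START}:${LENGTH}}"
echo "${RESULT}"    # s a stri
```

It works, but every call site has to repeat the arithmetic, which is exactly what a substr-style function hides.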

Trying to simulate this in bash is trickier than you may imagine because of the baseline we start with. To counter this, you have to do a fair amount of maths on the main string value. In the end I came up with this:

string.substr () {
	declare r t x y
	[[ ! ${1} || ! ${2} ]] && return 1
	[[ ! ${2} =~ ^[-0-9]+$ ]] && return 2
	[[ ${3} && ! ${3} =~ ^[-0-9]+$ ]] && return 3
	[[ ${2:0:1} == "-" ]] && r="0" || r="1"
	[[ ${3:0:1} == "-" ]] && t="0" || t="1"
	case ${r} in
		0)
			if [[ ! ${3} ]]; then
				printf "%s" "${1:$(( ${#1} - ${2:1} ))}"
			else
				case ${t} in
					0)
						x=$(( ${#1} - ${2:1} ))
						y=$(( ( ${#1} - ${3:1} ) - ( ${#1} - ${2:1} ) ))
						(( y <= 0 )) && y=${#1}
						printf "%s" "${1:${x}:${y}}"
					;;
					1)
						x=$(( ${#1} - ${2:1} ))
						y=$(( ( ${#1} - ${2:1} ) - ( ${#1} - ${3} ) ))
						(( y >= 0 )) && y=${#1}
						printf "%s" "${1:${x}:${y}}"
					;;
				esac
			fi
		;;
		1)
			if [[ ! ${3} ]]; then
				printf "%s" "${1:${2}}"
			else
				case ${t} in
					0)
						y=$(( ${2} - ( ${#1} - ${3:1} ) ))
						(( y >= 0 )) && y="-${#1}"
						printf "%s" "${1:${2}:${y:1}}"
					;;
					1) printf "%s" "${1:${2}:${3}}" ;;
				esac
			fi
		;;
	esac
	return 0
}

This took a fair amount of thinking to get it to behave the way PHP's function does. However, it does all the heavy lifting for you, and allows you to write something that's almost as elegant as PHP's solution:

STRING=$( string.substr "this is a string!" 6 8 )
echo "${STRING}"    # s a stri
STRING=$( string.substr "this is a string!" 6 -3 )
echo "${STRING}"    # s a stri
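If the nested case statements are hard to follow, the underlying maths can also be sketched as "turn both arguments into a start and an end position, then slice once". The helper below is a hypothetical alternative written for illustration only (substr_sketch is not in bashlib and is not the function above); it skips the argument validation and returns an empty string for an empty or inverted range.

```bash
#!/usr/bin/env bash
# Hypothetical helper: normalise PHP-style start/length arguments into plain
# non-negative positions, then do a single substring expansion.
substr_sketch () {
	local str="${1}" start="${2}" length="${3}"
	local -i len=${#str} from to

	# A negative start counts back from the end, clamped at the beginning.
	if (( start < 0 )); then
		from=$(( len + start ))
		(( from < 0 )) && from=0
	else
		from=${start}
	fi

	# No length: run to the end. Negative length: stop that many characters
	# before the end. Positive length: stop that many characters after "from".
	if [[ -z ${length} ]]; then
		to=${len}
	elif (( length < 0 )); then
		to=$(( len + length ))
	else
		to=$(( from + length ))
	fi
	(( to > len )) && to=${len}

	# Only print when the range is not empty or inverted.
	(( to > from )) && printf "%s" "${str:${from}:$(( to - from ))}"
	return 0
}

substr_sketch "this is a string!" 6 -3; echo     # s a stri
substr_sketch "this is a string!" -5 -2; echo    # rin
```

Working with an end position rather than a length means the four sign combinations collapse into two small if blocks.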

After feeling good about my success with substrings, I decided to continue that trend with my next two functions. Again, I checked PHP's implementation of these and tried to follow suit:

string.strpos () {
	declare -i i OFFSET
	[[ ! ${1} || ! ${2} ]] && return 2
	[[ ${3} && ! ${3} =~ ^[0-9]+$ ]] && return 3
	# Quote the needle so characters such as * and ? are matched literally.
	[[ ! ${3} && ${1} != *"${2}"* ]] && printf "%i" "-1" && return 1
	[[ ${3} && ${1:${3}} != *"${2}"* ]] && printf "%i" "-1" && return 1
	[[ ${3} ]] && OFFSET="${3}" || OFFSET="0"
	for (( i=OFFSET; i<${#1}; i++ )); do
		[[ ${1:${i}:${#2}} == "${2}" ]] && printf "%i" "$(( i + 1 ))" && break
	done
	return 0
}

string.strrchr () {
	declare -i i POS
	[[ ! ${1} || ! ${2} ]] && return 2
	# Quote the needle so characters such as * and ? are matched literally.
	[[ ${1} != *"${2}"* ]] && printf "%i" "-1" && return 1
	for (( i=0; i<${#1}; i++ )); do
		[[ ${1:${i}:${#2}} == "${2}" ]] && POS=$(( i + 1 ))
	done
	printf "%i" "${POS}"
	return 0
}

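Like their PHP counterparts, these functions search a string for the first and last instances of a substring. There are a few differences worth knowing: positions are counted from 1 rather than 0, -1 is printed when nothing is found (PHP returns false), and string.strrchr prints a position, which makes it closer to PHP's strrpos than to strrchr. Trying to work out how to iterate over the characters in a string took most of the time, but once I'd figured it out it was pretty easy. I even included PHP's offset ability in string.strpos. These allow you to do something like this: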

STR[0]=$( string.strpos "string some text string" "str" )
STR[1]=$( string.strrchr "string some text string" "str" )
STR[2]=$( string.strpos "string some text string" "str" 1 )

echo "${STR[0]}"
echo "${STR[1]}"
echo "${STR[2]}"

Which gives you the results 1, 18 and 18.
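The character-by-character scan that both functions rely on can be shown in isolation. The sketch below uses a hypothetical helper, all_positions (not part of bashlib), which walks the string one offset at a time, compares a needle-sized window at each step, and prints every 1-based position where the window matches.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each 1-based position at which needle occurs.
all_positions () {
	local haystack="${1}" needle="${2}"
	local -i i
	[[ -z ${haystack} || -z ${needle} ]] && return 2

	# Stop once the remaining text is shorter than the needle.
	for (( i = 0; i <= ${#haystack} - ${#needle}; i++ )); do
		# ${haystack:i:n} is the n-character window starting at offset i.
		[[ ${haystack:${i}:${#needle}} == "${needle}" ]] && printf "%i\n" "$(( i + 1 ))"
	done
	return 0
}

all_positions "string some text string" "str"    # prints 1, then 18
```

Stopping at the first hit gives the strpos behaviour, and remembering only the most recent hit gives the last-match behaviour.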

So, by combining the functions above, you can create substrings based on search results of a main string. As usual you can find these added to bashlib (link at the top of the homepage). I've also added a couple of other prototypes (like file.sock), but those will be for another day (also, enjoy the load of conversion functions I've added -- not sure how useful they really are though).
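As a rough illustration of combining a search with a slice, here is a self-contained sketch. To keep it short it does not use the bashlib functions; first_pos and last_pos are hypothetical stand-ins that find the match with prefix stripping instead of a loop, but they print 1-based positions (or -1) in the same way.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the search functions: 1-based position, or -1.
first_pos () {
	local prefix
	[[ -z ${2} || ${1} != *"${2}"* ]] && printf "%i" "-1" && return 1
	prefix="${1%%"${2}"*}"    # everything before the first match
	printf "%i" "$(( ${#prefix} + 1 ))"
}

last_pos () {
	local prefix
	[[ -z ${2} || ${1} != *"${2}"* ]] && printf "%i" "-1" && return 1
	prefix="${1%"${2}"*}"     # everything before the last match
	printf "%i" "$(( ${#prefix} + 1 ))"
}

TEXT="string some text string"
FROM=$( first_pos "${TEXT}" "some" )       # 8
TO=$( last_pos "${TEXT}" " string" )       # 17

# Positions are 1-based and bash offsets are 0-based, hence the "- 1".
RESULT="${TEXT:$(( FROM - 1 )):$(( TO - FROM ))}"
echo "${RESULT}"                           # some text
```

The only thing to keep an eye on is the off-by-one between 1-based search results and 0-based substring offsets.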