Tag Archives: bash

Linux command line language translator

I wanted a command-line language translator that could be used in bash shell scripts. There are a couple of options available, but none were versatile enough. A little research showed that Google Translate offered what I wanted and that it had a JSON interface, which I could use with Perl’s JSON module.

I added a few niceties such as help, multiple source options and a listing of available languages.

Entering:

tranny -t de 'I am a citizen of Berlin'

gets this result:

Ich bin ein Bürger von Berlin

and entering:

tranny 'Ich bin ein Bürger von Berlin'

returns:

I am a citizen of Berlin

I created a Google Code project for tranny at: http://code.google.com/p/tranny/

If you need a scriptable translator, give it a try. If you run into trouble or would like to suggest changes, leave comments here or at the project.

#!/usr/bin/perl
#
# what:      tranny, a language translator
# project:   https://code.google.com/p/tranny/
# copyright: Copyright 2010, Andrew Ault
# license:   Released under the MIT License: http://code.google.com/p/tranny/wiki/license
#
# Uses the JSON module from CPAN. To install: "sudo cpan JSON"
#

use strict;
use warnings;
use POSIX;
use Getopt::Std;
use JSON;
use LWP;

require 'sys/ioctl.ph';
die "no TIOCGWINSZ " unless defined &TIOCGWINSZ;

my $original;
my $winsize;
my $has_tty = 1;
my ( $screen_rows, $screen_cols, $screen_xpixels, $screen_ypixels );

my %languages = (
				  'afrikaans'      => 'af',
				  'albanian'       => 'sq',
				  'amharic'        => 'am',
				  'arabic'         => 'ar',
				  'armenian'       => 'hy',
				  'azerbaijani'    => 'az',
				  'basque'         => 'eu',
				  'belarusian'     => 'be',
				  'bengali'        => 'bn',
				  'bihari'         => 'bh',
				  'breton'         => 'br',
				  'bulgarian'      => 'bg',
				  'burmese'        => 'my',
				  'catalan'        => 'ca',
				  'cherokee'       => 'chr',
				  'chinese'        => 'zh',
				  'chinese simp'   => 'zh-cn',
				  'chinese trad'   => 'zh-tw',
				  'corsican'       => 'co',
				  'croatian'       => 'hr',
				  'czech'          => 'cs',
				  'danish'         => 'da',
				  'dhivehi'        => 'dv',
				  'dutch'          => 'nl',
				  'english'        => 'en',
				  'esperanto'      => 'eo',
				  'estonian'       => 'et',
				  'faroese'        => 'fo',
				  'filipino'       => 'tl',
				  'finnish'        => 'fi',
				  'french'         => 'fr',
				  'frisian'        => 'fy',
				  'galician'       => 'gl',
				  'georgian'       => 'ka',
				  'german'         => 'de',
				  'greek'          => 'el',
				  'gujarati'       => 'gu',
				  'haitian creole' => 'ht',
				  'hebrew'         => 'iw',
				  'hindi'          => 'hi',
				  'hungarian'      => 'hu',
				  'icelandic'      => 'is',
				  'indonesian'     => 'id',
				  'inuktitut'      => 'iu',
				  'irish'          => 'ga',
				  'italian'        => 'it',
				  'japanese'       => 'ja',
				  'javanese'       => 'jw',
				  'kannada'        => 'kn',
				  'kazakh'         => 'kk',
				  'khmer'          => 'km',
				  'korean'         => 'ko',
				  'kurdish'        => 'ku',
				  'kyrgyz'         => 'ky',
				  'lao'            => 'lo',
				  'latin'          => 'la',
				  'latvian'        => 'lv',
				  'lithuanian'     => 'lt',
				  'luxembourgish'  => 'lb',
				  'macedonian'     => 'mk',
				  'malay'          => 'ms',
				  'malayalam'      => 'ml',
				  'maltese'        => 'mt',
				  'maori'          => 'mi',
				  'marathi'        => 'mr',
				  'mongolian'      => 'mn',
				  'nepali'         => 'ne',
				  'norwegian'      => 'no',
				  'occitan'        => 'oc',
				  'oriya'          => 'or',
				  'pashto'         => 'ps',
				  'persian'        => 'fa',
				  'polish'         => 'pl',
				  'portuguese'     => 'pt',
				  'punjabi'        => 'pa',
				  'quechua'        => 'qu',
				  'romanian'       => 'ro',
				  'russian'        => 'ru',
				  'sanskrit'       => 'sa',
				  'scots_gaelic'   => 'gd',
				  'serbian'        => 'sr',
				  'sindhi'         => 'sd',
				  'sinhalese'      => 'si',
				  'slovak'         => 'sk',
				  'slovenian'      => 'sl',
				  'spanish'        => 'es',
				  'sundanese'      => 'su',
				  'swahili'        => 'sw',
				  'swedish'        => 'sv',
				  'syriac'         => 'syr',
				  'tajik'          => 'tg',
				  'tamil'          => 'ta',
				  'tatar'          => 'tt',
				  'telugu'         => 'te',
				  'thai'           => 'th',
				  'tibetan'        => 'bo',
				  'tonga'          => 'to',
				  'turkish'        => 'tr',
				  'ukrainian'      => 'uk',
				  'urdu'           => 'ur',
				  'uzbek'          => 'uz',
				  'uighur'         => 'ug',
				  'vietnamese'     => 'vi',
				  'welsh'          => 'cy',
				  'yiddish'        => 'yi',
				  'yoruba'         => 'yo',
);

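# get window size for country listing
# NOTE: the lines between the tty open and the file read were lost from the
# post; everything down to "text is from STDIN" is an assumed reconstruction
# based on the variables and options used elsewhere in this script.
open( TTY, "+</dev/tty" ) or $has_tty = 0;
if ( $has_tty && ioctl( TTY, &TIOCGWINSZ, $winsize = '' ) ) {
	( $screen_rows, $screen_cols, $screen_xpixels, $screen_ypixels ) = unpack( 'S4', $winsize );
} else {
	$screen_cols = 80;
}
close TTY if $has_tty;

my %opts;
getopts( 'f:t:o:hl', \%opts ) or usage();
usage() if $opts{h} || $opts{l};
my $from = defined $opts{f} ? $opts{f} : '';    # empty = auto-detect
my $to   = defined $opts{t} ? $opts{t} : 'en';

# text is from the command line
if (@ARGV) {
	$original = join ' ', @ARGV;
# text is from a file
} elsif ( defined $opts{o} ) {
	open( FILE, '<', $opts{o} ) or die "cannot open $opts{o}: $!\n";
	local $/ = undef;
	$original = <FILE>;
	close FILE;
# text is from STDIN
} else {
	# slurp STDIN
	local $/ = undef;
	$original = <STDIN>;
}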

my $ua = LWP::UserAgent->new;
$ua->agent("PGDict/1.0");
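# escape the text so spaces, quotes and non-ASCII bytes survive in the URL
require URI::Escape;    # part of the URI distribution that LWP depends on
my $query = URI::Escape::uri_escape($original);
my $request =
  HTTP::Request->new( GET => "http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=$from|$to&q=$query" );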
my $response = $ua->request($request);

if ( $response->is_success ) {
	my $perl_res = from_json( $response->content );
	if ( $perl_res->{'responseStatus'} eq '200' ) {
		print $perl_res->{'responseData'}->{'translatedText'} . "\n";
	} else {
		warn "error " . $perl_res->{'responseDetails'} . "\n";
	}
} else {
	print $response->status_line . "\n";
}

sub usage {
	print "usage: ";
	print "\ttranny -f language_code -t language_code [original text]\n\n";
	print "-f language_code (optional)\n\n";
	print "-t language_code (optional)\n\n";
	print "-o original_file (optional)\n\n";
	print "-h this help\n\n";
	print "-l language list\n\n";
	print "Tranny uses Google Translate and requires an Internet connection to work.\n";
	print "Text is translated from STDIN, from the command line or a file with -o.\n\n";
	print "By default, the 'from' language is automatically detected and translated to English (en).\n\n";
	if ( defined $opts{l} && $opts{l} == 1 ) { list_languages() }
	exit;
}

sub list_languages {
	# each entry is 21 characters wide; fit as many columns as the terminal allows
	my $num_columns = floor( ( $screen_cols || 80 ) / 21 ) || 1;
	my @names       = sort keys %languages;
	my $num_rows    = ceil( @names / $num_columns );
	# print row by row, reading the sorted list down each column
	for my $row ( 0 .. $num_rows - 1 ) {
		for my $col ( 0 .. $num_columns - 1 ) {
			my $key = $names[ $col * $num_rows + $row ];
			next unless defined $key;
			printf "%-14s %-6s", $key, $languages{$key};
		}
		print "\n";
	}
}

Bash for the word lover

It’s a WYSIWYG world. After all, we’re nearly in the future, which I define as 2019, the year Rick Deckard chases down replicants in Blade Runner. Still no flying cars, which is disappointing. Even so, we have Steve Jobs, so the future is coming, right?

GUI everything isn’t all that it could be. For many, many tasks, it is more expeditious to open a terminal and get a bash prompt. CLI. Character. Text. It could be green on black, or it could be a rainbow on white, but it is no different from a Televideo terminal, or a Teletype for that matter. As good as the Bourne Again Shell is, it is not graphical or fancy.

What it is is efficient. For a sharp mind, and one given to efficiency, the terminal is power. Want to replace frick with frack in 800 PHP files?

find ~/web/project3 -name '*.php' -print0 | xargs -0 perl -pi -e 's/frick/frack/g'

Bam. Done.
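If you do this sort of thing often, it can be wrapped up. Here is a minimal sketch, assuming GNU find and xargs and a Perl on the PATH; the function name replace_in_tree and its argument order are my own invention, not a standard tool. It treats the search string literally rather than as a regular expression.

```shell
#!/bin/bash
# replace_in_tree DIR GLOB OLD NEW  (hypothetical helper name)
# Replaces the literal string OLD with NEW in every file under DIR matching GLOB.
replace_in_tree() {
  local dir=$1 glob=$2 old=$3 new=$4
  # -print0 / -0 keep file names with spaces or newlines intact;
  # -r (GNU xargs) skips running perl when nothing matches.
  # OLD and NEW travel through the environment so no shell quoting leaks into Perl;
  # \Q...\E makes Perl treat OLD as plain text, not as a pattern.
  find "$dir" -type f -name "$glob" -print0 |
    OLD=$old NEW=$new xargs -0 -r perl -pi -e 's/\Q$ENV{OLD}\E/$ENV{NEW}/g'
}
```

Usage would look like: replace_in_tree ~/web/project3 '*.php' frick frack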

This is why I have 3 terminals open right now. One is connected to a server somewhere in Texas. I just fixed some text on a site with a command much like the one above.

But you already knew all that. You Googled and found this page, so you are already 1337 or whatnot. How about some word power on the command line?

Install some packages

This will install the packages we will use on an Ubuntu or Debian system. For other distributions, you will need to use your distribution's package manager.

To install on Ubuntu or Debian, just install the needed APT packages:

sudo aptitude -y install wordnet wamerican-large curl wget an

Definitions

This grabs a definition for a word from dict.org. For “unusual” for example:

curl --stderr /dev/null dict://dict.org/d:unusual | sed '/^[.,0-9].*$/d'

Which returns:

Unusual \Un*u"su*al\, a.
Not usual; uncommon; rare; as, an unusual season; a person of
unusual grace or erudition. -- {Un*u"su*al*ly}, adv. --
{Un*u"su*al*ness}, n.
[1913 Webster]

As you can see, you are using curl to request a definition for “unusual”, then using sed to filter the results and exclude the extra stuff you don’t want. You could just enter “curl dict://dict.org/d:unusual” for the raw deal. Good on ya.
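To see what the sed filter is throwing away without touching the network, here is a small sketch. The sample response is abbreviated by hand for illustration (it is not real server output), and the function names are made up. The DICT protocol wraps each definition in numeric status lines and ends it with a lone period, which is exactly what the pattern deletes.

```shell
#!/bin/bash
# Delete every line that starts with a period, a comma or a digit:
# that covers the DICT status lines and the lone "." terminator.
filter_dict() {
  sed '/^[.,0-9].*$/d'
}

# Hand-written stand-in for a server reply (an assumption, for illustration only).
sample_response() {
  printf '%s\n' \
    '220 dict.example banner' \
    '150 1 definitions retrieved' \
    '151 "unusual" gcide "example database"' \
    'Unusual \Un*u"su*al\, a.' \
    '   Not usual; uncommon; rare.' \
    '.' \
    '250 ok' \
    '221 bye'
}

# Only the two definition lines survive the filter.
sample_response | filter_dict
```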

You can turn this into a script:

#!/bin/bash
# display definition of a word
#
curl --stderr /dev/null "dict://dict.org/d:$1" | sed '/^[.,0-9].*$/d'

Save that in a file called “def” and run “chmod +x def” to make it executable. Then “def unusual” will return the same definition. You just created your own tool. You rock.

WordNet

How about more power? Princeton has a project called WordNet, which organizes nouns, verbs, adjectives and adverbs into sets of “cognitive synonyms” and provides tools to use this data. With WordNet, synonyms, antonyms and other lexical relations can be found for a given word.

To show a definition (still using “unusual” as an example):

wn unusual -over

Here’s the output:

Overview of adj unusual

The adj unusual has 3 senses (first 3 from tagged texts)

1. (24) unusual -- (not usual or common or ordinary; "a scene of unusual beauty"; "a man of unusual ability"; "cruel and unusual punishment"; "an unusual meteorite")
2. (1) strange, unusual -- (being definitely out of the ordinary and unexpected; slightly odd or even a bit weird; "a strange exaltation that was indefinable"; "a strange fantastical mind"; "what a strange sense of humor she has")
3. (1) unusual -- (not commonly encountered; "two-career families are no longer unusual")

This uses the “-over” option. Some other options are:

-synsa adjective synonyms
-synsn noun synonyms
-synsr adverb synonyms
-antsa adjective antonyms
-antsn noun antonyms
-antsr adverb antonyms

WordNet is extensive and there are many more options; run “man wn” for more.
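The six options listed above follow a pattern: a relation (syn or ant), the letter s, then a part-of-speech letter. A small sketch that builds the option for you, so you don't have to remember it; wn_option and wn_related are hypothetical helper names, and only the letters from the list above are accepted.

```shell
#!/bin/bash
# wn_option RELATION POS  (hypothetical helper)
# RELATION is syn or ant; POS is a (adjective), n (noun) or r (adverb).
wn_option() {
  local relation=$1 pos=$2
  case $relation in
    syn|ant) ;;
    *) echo "relation must be syn or ant" >&2; return 1 ;;
  esac
  case $pos in
    a|n|r) ;;
    *) echo "part of speech must be a, n or r" >&2; return 1 ;;
  esac
  # e.g. syn + a becomes -synsa
  printf '%s\n' "-${relation}s${pos}"
}

# wn_related WORD RELATION POS runs wn with the matching option.
wn_related() {
  local opt
  opt=$(wn_option "$2" "$3") || return 1
  wn "$1" "$opt"
}
```

With the wordnet package installed, “wn_related unusual syn a” would run “wn unusual -synsa”.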

Crossword help

This is simply a use of grep to pattern match words in a word list file.

Use a regular expression to find a word. In quotes, start your pattern with a “^” character and end with a “$” character. Use a period “.” for each unknown character.

grep '^.a...f.c.n...$' /usr/share/dict/words

Magnificent!
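If you would rather not type the anchors every time, a tiny wrapper can add them. This is a sketch; the name crossword is made up, and the default word list path is the one used above.

```shell
#!/bin/bash
# crossword PATTERN [WORDLIST]  (hypothetical helper)
# PATTERN uses "." for each unknown letter, e.g. ".a...f.c.n..."
crossword() {
  local pattern=$1 wordlist=${2:-/usr/share/dict/words}
  # Anchor both ends so only words of exactly this length match;
  # -i ignores case so capitalized entries are found too.
  grep -i -- "^${pattern}\$" "$wordlist"
}
```

Then “crossword '.a...f.c.n...'” does the same search as the grep command above.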

Rhyming

This uses the rhyme project, which provides a rhyming dictionary for the command line.

To get, build and install rhyme on your system:

sudo aptitude -y install build-essential libgdbm-dev libreadline-dev
mkdir -p ~/src
cd ~/src
wget http://softlayer.dl.sourceforge.net/project/rhyme/rhyme/0.9/rhyme-0.9.tar.gz
tar -xzf rhyme-0.9.tar.gz
cd rhyme-0.9
make
sudo make install

Holy smokes, you just built software! There is no stopping you. To find a rhyme, using “house” as an example:

rhyme house

House rhymes! Lots of them.

Anagrams

Anagrams are pretty much pure word fun. It is fun to see what an anagram of your name is.

Print single-word anagrams of “andrew”:

an -l 1 andrew

Just call me the wander warden.
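If you are curious how single-word anagrams can be found, here is a rough sketch in plain shell. It is not how the an program works internally (I have not looked); it just uses the usual trick of sorting the letters of each word into a “signature” and comparing signatures. The function names are made up, it assumes plain ASCII words, and it is slow on a big word list because it runs a small pipeline for each candidate word.

```shell
#!/bin/bash
# signature WORD: lowercase the word and sort its letters, e.g. andrew -> adenrw
signature() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | fold -w1 | sort | tr -d '\n'
}

# anagrams WORD [WORDLIST]: print every word in the list with the same letters.
anagrams() {
  local target=$1 wordlist=${2:-/usr/share/dict/words} want word
  want=$(signature "$target")
  while IFS= read -r word; do
    # Cheap length check first, so most words skip the sort pipeline.
    [ "${#word}" -eq "${#target}" ] || continue
    if [ "$(signature "$word")" = "$want" ]; then
      printf '%s\n' "$word"
    fi
  done < "$wordlist"
}
```

Try “anagrams andrew” and compare the result with what an prints.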

Ubuntu command line: see PDF of a man page

So, you use the command line. And, you’d like to look at a command’s manual page.

Wouldn’t it be handy to open the page into another window, nicely formatted, all typeset and neat? That is exactly what this little script will do.

I keep my personal scripts and executables in ~/bin (the bin directory inside my home directory). My ~/.profile file in Ubuntu already adds that to the PATH, if ~/bin exists.

If needed, create ~/bin:

mkdir ~/bin

Create a shell script called ~/bin/gman with these lines:

#!/bin/sh -e
man -t "$1" | ps2pdf - > "/tmp/$1.man.pdf"
gnome-open "/tmp/$1.man.pdf"

Make the shell script executable with:

chmod +x ~/bin/gman

If needed, exit your shell and open it back up again. This would be to ensure that ~/bin is detected and added to your PATH.

Then, to use it, just type gman and the command you are interested in, e.g. grep.

gman grep

This will show you a beautiful, formatted document.
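One caveat: gnome-open is not installed on every desktop. Here is a sketch of the same idea as a shell function that falls back to xdg-open (from the xdg-utils package) when gnome-open is missing. The helper name first_available is made up, and the fallback order is my assumption, not part of the original script.

```shell
#!/bin/bash
# first_available CMD...: print the first command name that exists on the PATH.
first_available() {
  local cmd
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
      return 0
    fi
  done
  return 1
}

# gman COMMAND: typeset the man page as a PDF and open it in a viewer.
gman() {
  local page=$1 opener pdf
  [ -n "$page" ] || { echo "usage: gman command" >&2; return 2; }
  opener=$(first_available gnome-open xdg-open) ||
    { echo "gman: no gnome-open or xdg-open found" >&2; return 1; }
  pdf="/tmp/$page.man.pdf"
  # Only open the viewer if the conversion succeeded.
  man -t "$page" | ps2pdf - "$pdf" && "$opener" "$pdf"
}
```

Put the two functions in your ~/.bashrc and “gman grep” works the same way as the script.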

SSH trick: temporarily return to your local shell

If you are using SSH to access a command shell on a remote system and you would like to temporarily return to a shell on your local system, there is an easy way to do so.

Simply press Enter, then type a tilde (“~”) followed by Control-Z. The escape character is only recognized at the start of a line.

This will place your SSH session into the background. You will be in a shell on your local system.

You can get the job number of the SSH session with:

jobs

Then, to return to the remote session (assuming that the job number you saw when you entered the above command was “1”), enter:

fg 1

Note that the remote shell will not print the prompt; press Enter once to see the remote session prompt again.

Padding a Numeric in Bash

I needed to pad a day of the month value to 2 places in a bash script.

This is made easy by printf, which is built into bash and also ships as a standalone program in GNU coreutils. In the following script snippet, the day of the month is taken from the first command-line argument (or, if none is given, defaults to the current day). It is then zero-padded with printf.

TODAY=$(date +%d)
if [[ "$1" != "" ]]; then
  TODAY=$1
fi
# Zero pad day. 10# forces base 10 so "08" and "09" are not read as invalid octal.
TODAY=$(printf "%02d" "$((10#$TODAY))")
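One thing to watch for: printf treats a number with a leading zero as octal, so a value such as 08 is rejected as an invalid octal number. Here is a sketch of the padding step as a reusable function that sidesteps this with bash's base prefix. The name pad2 is made up for illustration.

```shell
#!/bin/bash
# pad2 N  (hypothetical helper): print N zero-padded to two digits.
pad2() {
  case $1 in
    ''|*[!0-9]*) echo "pad2: not a non-negative integer: $1" >&2; return 1 ;;
  esac
  # 10# forces base 10; without it, bash would read 08 and 09 as invalid octal.
  printf '%02d\n' "$((10#$1))"
}

pad2 8     # prints 08
pad2 08    # prints 08
pad2 25    # prints 25
```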