I wanted to show off at least some of the instructables I have created over the last year on our local web server. Downloading the whole web is still a work in progress, so there had to be a way to get the pages while saving space and keeping file management to a minimum. I chose PDFit, a Firefox add-on, to download each instructable as a single page. So far so good. Web servers generally do not display PDF pages directly, but I found that PDF files can be converted to HTML files using poppler-utils. Awesome. All that is left is to create a web page to act as a table of contents for the files. More info at: http://www.instructables.com/id/Display-PDF-files-with-a-linux-server/.

One way to get data off a web page from the command line is called scraping. Here are a few examples of how to do that.
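
Every snippet here follows the same two steps: fetch the page as text, then filter out the line you want. Here is a rough sketch of that pattern. The helper name scrape is made up for illustration, and printf stands in for the real fetch so it can be tried without a network connection.

```shell
#!/bin/sh
# Sketch of the fetch-then-filter pattern; "scrape" is a made-up helper name.
# Usage: scrape PATTERN COMMAND [ARGS...]
scrape() {
    pattern=$1
    shift
    # Run the fetch command, then keep only the lines that match the pattern.
    "$@" | grep -- "$pattern"
}

# Offline try-out: printf plays the part of a real fetch such as
#   scrape Today lynx -dump "http://example.com/"   (placeholder URL)
scrape Today printf '%s\n' 'Header' 'Today Sunny 90/70' 'Tomorrow Rain'
```

Swapping printf for lynx, elinks or wget is all it should take to point it at a live page.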

Some code snippets to play with:

#1
—————————————————————-

#!/bin/sh
# Determine the current U.S. Terrorist Threat Advisory level, and output it to
# a serial device
#
# Written by Matt Mets in 2008
# Modified by Computothought in 2011
# This code is released into the public domain, attribution is appreciated

#### Configuration Options ##############################################

# Location of the threat alert level XML document
SRC_URL="http://www.dhs.gov/dhspublic/getAdvisoryCondition"
# Not sure whether this site ever gets updated
#<?xml version="1.0" encoding="UTF-8" ?> 
#<THREAT_ADVISORY CONDITION="LOW" /> 
# Left for compatibility to original use of this code
# Serial device to output the result to (not used at this time)
# TTY_DEV="/dev/ttyS1"

#########################################################################

# Retrieve the current threat level
# We are expecting something like:
# <?xml version="1.0" encoding="UTF-8" ?>
# <THREAT_ADVISORY CONDITION="ELEVATED" />
ALERT_DOC=`wget ${SRC_URL} -q -O -`

# Do some rudimentary parsing using shell pattern matching to isolate the
# current threat level
# TODO: can this be done in one pass?
TMP=${ALERT_DOC#*CONDITION=\"}
ALERT_LEVEL=${TMP%\" *}

# Convert the output into a serial command
# (The current system expects just a number 1-5)
case ${ALERT_LEVEL} in
    SEVERE)     COMMAND=5 ;;
    HIGH)       COMMAND=4 ;;
    ELEVATED)   COMMAND=3 ;;
    GUARDED)    COMMAND=2 ;;
    LOW)        COMMAND=1 ;;
esac

# If a threat level was retrieved, output it to the device
if [ -n "${COMMAND}" ]
then
    # Send the command to the serial port (-n suppresses the newline character)
    echo -n "Threat level is: "
    echo  ${COMMAND} ${ALERT_LEVEL} # > ${TTY_DEV}
    echo  Threat level is: ${COMMAND} ${ALERT_LEVEL} > tl
    twidge update < tl
    exit 0
else
    exit 1
fi

$ ./tlp
Threat level is: 1 LOW
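
About the TODO in the script: one possible way to do the parsing in a single pass is to let sed pull out the attribute value. This is only a sketch, and it assumes the document looks like the sample shown in the script comments. The function name alert_level is made up.

```shell
#!/bin/sh
# One-pass alternative to the two parameter expansions (a sketch).
# Reads the XML on stdin and prints the value of the CONDITION attribute.
alert_level() {
    sed -n 's/.*CONDITION="\([^"]*\)".*/\1/p'
}

# Sample document copied from the comments in the script.
printf '%s\n' '<?xml version="1.0" encoding="UTF-8" ?>' \
              '<THREAT_ADVISORY CONDITION="ELEVATED" />' | alert_level
```

In the script it could be used as ALERT_LEVEL=$(wget ${SRC_URL} -q -O - | alert_level).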

This might work better (at least for today):
gtp1:

# Get today's threat level
lynx -width 20 -dump "http://www.usasecure.org/threat.php" > threatlevel ; cat threatlevel | grep "Current Threat Level:"  > tl
cut -b 53-92 tl

$ ./gtp1
Current Threat Level: ELEVATED CONDITION
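
The cut -b 53-92 step depends on the exact column the text lands in, so it breaks as soon as the page layout shifts. A hedged alternative is to match on the label itself and drop whatever surrounds it. The function name threat_line and the sample lines are made up for the try-out.

```shell
#!/bin/sh
# Sketch: keep the text from "Current Threat Level:" onward, whatever column
# it starts in, and strip trailing whitespace. Reads page text on stdin.
threat_line() {
    sed -n -e 's/[[:space:]]*$//' \
           -e 's/^.*\(Current Threat Level:.*\)$/\1/p'
}

# Offline try-out with made-up padding around the line of interest.
printf '%s\n' '      some other text' \
              '         Current Threat Level: ELEVATED CONDITION   ' | threat_line
```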

#2
—————————————————————-

gwp:

# Get today's weather
ZIP=77331
elinks "http://www.weather.com/weather/print/$ZIP" > weather ; cat weather | grep Today

$ ./gwp
Today Partly Cloudy / Wind 93°/71° 10 %

Variation:

gwp1:

# Get today's weather
ZIP=$1
echo -n "The weather for $ZIP: "
elinks "http://www.weather.com/weather/print/$ZIP" > weather ; cat weather | grep Today

$ ./gwp1 77482
The weather for 77482: Today Partly Cloudy 92°/73° 20 %
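
Since gwp1 takes the ZIP code from the command line, it may be worth checking the argument before fetching anything. A minimal sketch, assuming five-digit U.S. ZIP codes only. The function name is_zip is made up.

```shell
#!/bin/sh
# Sketch: succeed only when the argument is exactly five digits.
is_zip() {
    case $1 in
        [0-9][0-9][0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Try it on a good value and two bad ones.
for z in 77482 7748 abcde
do
    if is_zip "$z"; then
        echo "$z looks like a ZIP code"
    else
        echo "$z rejected"
    fi
done
```

At the top of gwp1 this could become: is_zip "$1" || { echo "Usage: gwp1 ZIPCODE"; exit 1; }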

gw:

#!/bin/bash
# weather.bash
#desc Find current weather stats for your zip code
#desc Ex: ${trig}weather 03301
# weather 1.1 -Crouse
# With Updates by Jeo
# Modified to run stand alone by Brian Masinick
# Example: !weather 03301
# Usage: weather + zipcode

zipcode=$1
if [ -z "$zipcode" ]; then
echo "Please provide a zip code (Ex: weather 03301)"
else
unset response
# Add a backslash (\) after -dump-width 300 if this line splits
# across two lines; Should be one distinct line:
WEATHER="$(elinks -dump -dump-width 300 "http://mobile.wunderground.com/cgi-bin/findweather/getForecast?query=${zipcode}" | grep -A16 Updated)"

if [ -z "$WEATHER" ]; then
response="No Results for $zipcode"
echo "${response}"
else
response[1]="$(echo "$WEATHER" | grep -Eo 'Observed.*' | sed 's/ *| */|/g' | awk -F'|' '{print "Weather: " $1}')"
response[2]="$(echo "$WEATHER" | grep -Eo 'Updated.*' | sed 's/ *| */|/g' | awk -F'|' '{print $1}')"
response[3]="$(echo "$WEATHER" | grep -Eo 'Temperature.*' | sed 's/ *| */|/g' | awk -F'|' '{print $1 ": " $2}' | sed 's/DEG/ /g')"
response[4]="$(echo "$WEATHER" | grep -Eo 'Windchill.*' | sed 's/ *| */|/g' | awk -F'|' '{print $1 ": " $2}' | sed 's/DEG/ /g')"
response[5]="$(echo "$WEATHER" | grep -Eo 'Wind .*' | sed 's/ *| */|/g' | awk -F'|' '{print $1 ": " $2}')"
response[6]="$(echo "$WEATHER" | grep -Eo 'Conditions.*' | sed 's/ *| */|/g' | awk -F'|' '{print $1 ": " $2}')"
response[7]="$(echo "$WEATHER" | grep -Eo 'Humidity.*' | sed 's/ *| */|/g' | awk -F'|' '{print $1 ": " $2}')"
response[8]="$(echo "$WEATHER" | grep -Eo 'Dew.Point.*' | sed 's/ *| */|/g' | awk -F'|' '{print $1 ": " $2}' | sed 's/DEG/ /g')"
response[9]="$(echo "$WEATHER" | grep -Eo 'Pressure.*' | sed 's/ *| */|/g' | awk -F'|' '{print $1 ": " $2}')"

for index in $(seq 1 9); do
if [ -n "${response[$index]}" ]; then
echo "${response[$index]}"
fi
done
fi
fi

$ ./gw 77331
Weather: Observed at Wolf Creek Air Cond., Coldspring, Texas
Updated: 12:52 AM CDT on June 22, 2011
Temperature: 78.9°F / 26.1°C
Wind: WNW at 0.0 mph / 0.0 km/h
Conditions: Overcast
Humidity: 70%
Dew Point: 68°F / 20°C
Pressure: 29.90 in / 1012.4 hPa (Rising)
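
The gw script runs grep, sed and awk once per field. As a rough alternative, a single awk pass can do the same job. This sketch assumes the dumped page puts each reading on its own line as "Label | value", which is what the sed and awk calls in the script imply. The function name format_fields and the sample lines are made up.

```shell
#!/bin/sh
# Sketch: turn "Label | value" lines into "Label: value" in one awk pass.
# The field separator is a pipe with any spaces around it.
format_fields() {
    awk -F' *[|] *' '
        /^ *(Temperature|Windchill|Conditions|Humidity|Pressure)/ {
            sub(/^ */, "", $1)      # drop leading spaces from the label
            print $1 ": " $2
        }'
}

# Offline try-out with made-up sample lines.
printf '%s\n' '  Temperature | 78.9 F / 26.1 C' \
              '  Moon | Waning' \
              '  Humidity | 70%' | format_fields
```

Lines whose label is not in the list (Moon in the sample) are skipped, so the list doubles as the filter.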

#3
—————————————————————
ghp:

# Get today's horoscope
hsign="VIRGO"
lynx -width 1000   -dump "http://www.creators.com/lifestylefeatures/horoscopes/horoscopes-by-holiday.html" > hscope ; cat hscope | grep $hsign

$ ./ghp
VIRGO (Aug. 23-Sept. 22). Pride of ownership applies to all of your possessions, and you’ll take care that they sparkle, shine and really work. Tonight, you’ll be reminded how much you cherish and need plenty of space to do your thing.

Variation:
ghp1:

# Get today's horoscope
hsign=$1
date +%D
echo -n "Today's horoscope for:"
lynx -width 1000   -dump "http://www.creators.com/lifestylefeatures/horoscopes/horoscopes-by-holiday.html" > hscope ; cat hscope | grep "$hsign"

$ ./ghp1 PISCES
05/31/11
Today’s horoscope for: PISCES (Feb. 19-March 20). There are so many people who are trying to do what you already do so well. You really are doing the world a disservice unless you share what you know. In your heart, you are a teacher.

Yet another way:
wghp

# Get today's horoscope
hsign=$1
wget -q "http://www.creators.com/lifestylefeatures/horoscopes/horoscopes-by-holiday.html"
cat horoscopes-by-holiday.html  | grep "$hsign" > hscope
cat hscope | sed 's/...\(.*\)..../\1/'
rm horoscopes-by-holiday.html

$ ./wghp GEMINI
GEMINI (May 21-June 21). A distracting influence may actually do you a favor. Some will find it difficult to get back to work after the disruption, but you’ll find the break in order to be creatively invigorating.
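
The page lists the signs in capitals, so the grep in these scripts only matches when the sign is typed in capitals too. A small sketch that upper-cases the argument and rejects anything that is not one of the twelve signs. The function name sign_or_fail is made up.

```shell
#!/bin/sh
# Sketch: normalize a sign name to upper case, or fail if it is not a sign.
sign_or_fail() {
    sign=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    case $sign in
        ARIES|TAURUS|GEMINI|CANCER|LEO|VIRGO|LIBRA|SCORPIO|SAGITTARIUS|CAPRICORN|AQUARIUS|PISCES)
            printf '%s\n' "$sign" ;;
        *)
            echo "Unknown sign: $1" >&2
            return 1 ;;
    esac
}

# Try it with a lower-case sign.
sign_or_fail pisces
```

In ghp1 or wghp the first line could then read: hsign=$(sign_or_fail "$1") || exit 1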

Play the lottery? Here is a simple random number generator to pick the numbers. Tell it how many balls to draw and how many numbers there are to choose from, and it prints one pick per ball.

lg.sh:

#!/bin/bash
echo Lottery generator
echo
echo -n "Enter number of balls: "
read nodf      # how many balls to draw
echo -n "Enter number of choices: "
read b         # highest number a ball can show
declare -i X=$b
for i in $(seq 1 "$nodf")
do
    # $RANDOM is 0-32767; the modulo maps it to 1..X
    NUM=$(( (RANDOM % X) + 1 ))
    echo "The winner is for ball number $i:" $NUM
done

$ ./lg.sh
Lottery generator

Enter number of balls: 5
Enter number of choices: 50
The winner is for ball number 1: 32
The winner is for ball number 2: 43
The winner is for ball number 3: 23
The winner is for ball number 4: 18
The winner is for ball number 5: 38

Don’t forget to “chmod +x” it. Great for choosing winners randomly in a contest. Write the number down after each toss. I will let you mod the code to save the numbers.

Note: it does not check for duplicates.
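
If duplicates are a problem, one way around them is to shuffle the pool of numbers and take the first few. This sketch uses awk's rand() instead of bash's $RANDOM so it runs under plain sh. The function name draw_unique is made up. Note that awk seeds from the clock, so two runs within the same second can give the same picks.

```shell
#!/bin/sh
# Sketch: print COUNT different numbers between 1 and MAX, one per line.
# Usage: draw_unique COUNT MAX
draw_unique() {
    awk -v n="$1" -v max="$2" 'BEGIN {
        srand()                       # seed from the time of day
        if (n > max) n = max          # cannot draw more balls than exist
        for (i = 1; i <= max; i++) pool[i] = i
        for (i = 1; i <= n; i++) {
            # swap slot i with a random slot from i..max, then print it
            j = i + int(rand() * (max - i + 1))
            t = pool[i]; pool[i] = pool[j]; pool[j] = t
            print pool[i]
        }
    }'
}

# Five different picks out of fifty.
draw_unique 5 50
```

Each number is swapped out of the remaining pool once it is drawn, so it cannot come up a second time.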

Have not been very technical in a while so I hope these small morsels will help you in some way.
