Posts Tagged 'PHP'
2010-01-27
For a recent video editing project, I really wanted to have some mellow classical music playing in the background. Due to licensing restrictions, the classical music needed to be either in the Public Domain or released under a very permissive Creative Commons license.
Fortunately, the United States Air Force Band has quite a few Public Domain recordings, of Public Domain works, on their website. Unfortunately, the Air Force website leaves a lot to be desired regarding listening to and downloading tunes.
The website uses JavaScript to open a popup player, and this is terrible. Actually, it is good, because the JavaScript that launches the player requires the full URL of the mp3 file that the player is supposed to play.
What this means is that I need only sift through the source of the page containing links to audio files and pull out the URL strings that start with "http://" and end with ".mp3". This sounds like a job for a non-greedy Regular Expression.
Enter the PHP:

#!/usr/bin/env php
<?php
if ( count($argv)>1 )
{
    //the first argument is the URL of the page to search
    $file = $argv[1];
    $text = file_get_contents( $file );
    //non-greedy match: from "http://" to the nearest ".mp3"
    preg_match_all("/http:\/\/[A-Za-z\/_\.0-9]*?\.mp3/",$text,$matches);
    if(count($matches[0]) )
    {
        foreach($matches[0] as $match)
        {
            echo "$match\n";
        }
    }
}
?>

What? You don't write command line scripts in PHP?
The script (which I named echo_mp3.php) takes the URL of a webpage as an argument and prints out the URLs of the mp3s that are found in the webpage's code. For my usage, I wrote the list of mp3s to a file by executing

./echo_mp3.php http://example.com/some/file.htm >> mp3list.txt

and then used wget to batch download the files in the list.
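A quick aside on why the "non-greedy" part matters. The sketch below is not part of the original script; the sample text and variable names are made up for illustration, and it uses a plain `.*` so the difference is easy to see. The greedy pattern swallows everything from the first "http://" to the last ".mp3", while the non-greedy pattern stops at the nearest ".mp3".

```php
<?php
// Made-up sample text with two player calls on one line.
$text = 'play("http://example.com/a.mp3"); play("http://example.com/b.mp3");';

// Greedy: .* grabs as much as possible, so both URLs end up in one match.
preg_match_all('/http:\/\/.*\.mp3/', $text, $greedy);

// Non-greedy: .*? grabs as little as possible, so each URL is its own match.
preg_match_all('/http:\/\/.*?\.mp3/', $text, $lazy);

echo count($greedy[0]), " greedy match(es)\n";
echo count($lazy[0]), " non-greedy match(es)\n";
```

The script above narrows things further with a character class, which keeps quotes and spaces out of the match in the first place.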
Now stop reading, and go listen to some Vivaldi.
2009-11-02
For a while now, I've been keeping track of phone numbers by entering them into a plain text file on my laptop and searching for numbers with a simple shell script named phone.
#!/bin/sh
clear
# case-insensitive search for the first argument in the phone list
grep -i "$1" /path/to/phonelist.txt

This is fine and dandy if I'm actually using my laptop, but I needed a way to access the phone numbers from all of my computers. If you are playing Buzzword Bingo, put a checkmark on "cloud". There is a low-power computer in my home network that will perform the task of "cloud based phone number listing" wonderfully.
Since "cloud" is a Web 2.0 buzzword, I figured I should make the user interface of my phone number finder behave in an AJAXy way, but without JavaScript or XML. (Mark off Web 2.0, AJAX, JavaScript, and XML.)
Here is the HTML of the interface:
<html>
<head>
    <title>phone</title>
    <style type="text/css">
    #result_frame { border:0px; width:50%; height:50% }
    </style>
</head>
<body>
    <form name='p' action='phone.php' method='post' target='result_frame'>
        <input type="text" name="search_string">
        <input type="submit" value="find">
    </form>
    <iframe name="result_frame" id="result_frame">
    </iframe>
</body>
</html>
A simple form makes a request and the results are displayed in a targeted inline frame. Thus, there is not a full page refresh when the user selects a different name to search for. Oh shiny!
On to the searching code:
<?php
//get the search string from the POST data (empty if the field is missing)
$s_str = isset($_POST['search_string']) ? trim($_POST['search_string']) : "";
//where is the phone file?
$phone_file = "phonelist.txt";
$results = "";
//only search if there is something to search for
if(strlen($s_str)>0)
{
    //read the file into an array
    $file_lines = file($phone_file);
    //loop through the array of lines
    foreach($file_lines as $line)
    {
        //is the search string in the line? (case-insensitive)
        if( stristr($line,$s_str)!==false )
        {
            //add the line to the results.
            $results.=$line;
        }
    }
}
//are there results?
if($results==="")
{
    //if not, let the user know that there were no matches
    $results = "no matches found for \"$s_str\"";
}
//escape anything that came from the user or the file before printing it as HTML
$safe_str = htmlspecialchars($s_str);
echo "<html><head><title>Results for $safe_str</title></head>";
echo "<body><pre>";
echo htmlspecialchars($results);
echo "</pre></body></html>";
?>
This is a very simple PHP script to search for matching lines in the text file.
A working example of this code is viewable at http://www.jezra.net/code_examples/phonenumbers
Things that could have been done differently:
- the numbers could be put into a database
- a regular expression could search the text file
The key here was development time. Compared to the actual development time, creating a database, adding the numbers, and writing the code to retrieve the numbers would have added a significant amount of time. Unfortunately, I didn't think of using a regex until after I had written the code, and since the code worked as expected, I felt no need to change it.
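For the curious, here is a minimal sketch of what the regular expression route might look like. It is not the code that runs the example; the helper name `find_numbers` and the sample lines are invented for illustration. It leans on `preg_grep` to filter the lines and `preg_quote` so that characters such as "(" in the search string are treated literally.

```php
<?php
// Hypothetical helper: return the lines that contain $needle, ignoring case.
function find_numbers(array $lines, $needle)
{
    if ($needle === '') {
        return array();
    }
    // preg_quote() keeps regex metacharacters in the search string literal.
    $pattern = '/' . preg_quote($needle, '/') . '/i';
    // preg_grep() keeps the original keys, so re-index the result.
    return array_values(preg_grep($pattern, $lines));
}

// Made-up phone list; the real script would use file("phonelist.txt").
$lines = array("Alice 555-0100\n", "Bob 555-0101\n", "alice (work) 555-0102\n");
$found = find_numbers($lines, 'alice');
echo implode('', $found);
```

Whether this beats the `stristr` loop is debatable; for a simple substring search the two do the same job.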
Now quit reading, and call a friend.
Comments
handy Web 2.0 tutorial, Jezra. may play with this in the future. You might want to handle the error when the user clicks find on empty search query, though.
HA! Thanks digi. The example PHP code has been edited to only search the text file if the length of the search string is greater than zero.
if(strlen($s_str)>0)
{
//do the search
}
2009-01-30
A while ago, I needed to spider all of the pages in a dynamically generated website and do something with the data. Since the spider only needed to work with the one website, I was free to add as much site-specific code as I needed. The script has since been copied and adjusted to fit the needs of whatever site I happen to be working on at the time. In a nutshell, the script gets passed a web page address, downloads the page, uses a regex to find all of the links, parses the links, and then recurses through the links.
The current iteration of the script spiders a "url-beautified" site and generates a very basic sitemap suitable for submission to Google or Yahoo, and the current code looks like this:
#!/usr/bin/env php
<?php
function getRealPath($baseDir,$link)
{
    $newDir='';
    //count the number of ../ in the link
    $rise = substr_count($link,"../");
    $folders = explode("/",$baseDir);
    for($i=0 ; $i < count($folders)-$rise ; $i++)
    {
        $newDir .= $folders[$i]."/";
    }
    $link = str_replace("../","",$link);
    $link = str_replace(" ","%20",$link);
    return $newDir.$link;
}

function getLinks($file,$parent='')
{
    $dirname='';
    global $pageList;
    global $brokenLinksString;
    $text = @file_get_contents($file);//get the text from the file
    if(trim($text)=="")//the file can't be opened
    {
        $brokenLinksString.="$parent\n\t$file\n\n";
        echo "--Missing File--\n\t$file\n";
        return 1;
    }
    echo "$file\n";
    //normalize the different ways of writing href=
    $text = str_replace(array("href =","HREF =","href= ","HREF= ","HREF="),"href=",$text);
    preg_match_all("/href=[\"'](.*?)[\"']/", $text, $link_results);
    foreach ($link_results[1] as $link)
    {
        $link_is_absolute=false;
        //clean up ampersands in the link
        $link = str_replace("&amp;","&",$link);
        if(stristr($link,"mailto") )
        {
            continue;
        }
        $processedLink = strtolower($link);
        //find out if the link is external before we figure out the realLink
        if (stristr($processedLink,"http:") || stristr($processedLink,"https:") )
        {
            if(strpos($processedLink,BaseHREF)===0 )
            {
                //this link is an absolute path on this site
                //(it still contains dots, so the extension check below skips it)
                $link_is_absolute=true;
            }else{
                //this link is external
                continue;
            }
        }
        //ignore all extensions
        if(strstr($processedLink,".") )
        {
            continue;
        }
        //if the file is absolute local we don't need to discover the abspath
        if($link_is_absolute)
        {
            $absLink = $processedLink;
        }else{
            $absLink = getRealPath(BaseHREF."$dirname",$processedLink);
        }
        if( !in_array($absLink,$pageList))
        {
            array_push($pageList,$absLink);
            getLinks($absLink,$file);
        }
    }
    return 1;
}

function xmlEncode($string)
{
    $newString = str_replace(array("&","'","\"",">","<"),array("&amp;","&apos;","&quot;","&gt;","&lt;"),$string);
    return $newString;
}

/*begin the script*/
#make some global variables
$pageList = array();
$brokenLinksString="";
$brokenLinks = array();
define ("BaseHREF",$argv[1]);
getLinks(BaseHREF); //find the links in the file
//make the sitemap
$urlString = "";
foreach($pageList as $filePath)
{
    //create the url for our Google Site Map
    $subPath = explode("/",$filePath);
    $folderDepth = count($subPath)-1;
    $subPath[$folderDepth] = xmlEncode( $subPath[$folderDepth] );
    $encodedPath = implode("/",$subPath);
    $loc = $encodedPath;
    $urlString.= "\t<url>\n" ;
    $urlString.= "\t\t<loc>$loc</loc>\n";
    if(stristr($filePath,"home.") )
    {
        $priority = 1-($folderDepth*0.1);
    }else{
        $priority = 0.8-($folderDepth*0.1);
    }
    $urlString.="\t\t<priority>$priority</priority>\n";
    $urlString.= "\t</url>\n";
}
$siteMapText="<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n";
$siteMapText.="<urlset";
$siteMapText.=' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemaps/0.9 http://www.sitemaps.org/schemas/sitemaps/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
$siteMapText.="\n";
$siteMapText.=$urlString;
$siteMapText.="</urlset>";
$fhandle = fopen("sitemap.xml","w");
fwrite($fhandle,$siteMapText);
fclose($fhandle);
?>
As the script runs, it will print to standard out the paths of the files that are being spidered and it will also print a warning about any missing webpages. When I run the script against my jezra.net site, the output is:
http://www.jezra.net
http://www.jezra.net/home
http://www.jezra.net/projects
http://www.jezra.net/music
http://www.jezra.net/contact
http://www.jezra.net/projects/hubcap
http://www.jezra.net/projects/serial_switch
http://www.jezra.net/projects/svggraph
http://www.jezra.net/projects/vplayer
Good, no missing files. The sitemap that is generated by the script is as follows:
<?xml version="1.0" encoding="UTF-8" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://www.jezra.net/home</loc>
        <priority>0.5</priority>
    </url>
    <url>
        <loc>http://www.jezra.net/projects</loc>
        <priority>0.5</priority>
    </url>
    <url>
        <loc>http://www.jezra.net/music</loc>
        <priority>0.5</priority>
    </url>
    <url>
        <loc>http://www.jezra.net/contact</loc>
        <priority>0.5</priority>
    </url>
    <url>
        <loc>http://www.jezra.net/projects/hubcap</loc>
        <priority>0.4</priority>
    </url>
    <url>
        <loc>http://www.jezra.net/projects/serial_switch</loc>
        <priority>0.4</priority>
    </url>
    <url>
        <loc>http://www.jezra.net/projects/svggraph</loc>
        <priority>0.4</priority>
    </url>
    <url>
        <loc>http://www.jezra.net/projects/vplayer</loc>
        <priority>0.4</priority>
    </url>
</urlset>
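In case the priority numbers look like magic, here is a small sketch of the arithmetic. The function name `sitemap_priority` is made up for this example, and it leaves out the special case for "home." pages; it only shows the rule that each "/" in the URL costs 0.1.

```php
<?php
// Hypothetical helper mirroring the script's rule for ordinary pages:
// split the URL on "/", count the pieces, and subtract 0.1 per level from 0.8.
function sitemap_priority($url)
{
    $depth = count(explode('/', $url)) - 1;
    return 0.8 - ($depth * 0.1);
}

$urls = array(
    'http://www.jezra.net/home',
    'http://www.jezra.net/projects/hubcap',
);
foreach ($urls as $url) {
    printf("%s => %.1f\n", $url, sitemap_priority($url));
}
```

Note that the "http://" prefix contributes two of the slashes, so a top-level page has a depth of 3 and a priority of 0.5, which matches the sitemap above.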
Hey, it gets the job done.
2008-11-26
Hindsight is something something something
I didn't want to write a web framework, and in hindsight, it probably would have been better if I had learned one of the more mature web development frameworks such as CakePHP, CodeIgniter, or, if I had been writing my code in Python, Django or Pylons. There are also numerous Java-based frameworks that I could have used. My code lacks a lot of the features of the aforementioned frameworks but, and this is the important part, I didn't need to read a tutorial or learn how to develop with my framework because it evolved with my application.
How it started
One of the requirements of the site I was developing was clean/beautified URLs. I didn't want to ever have a URL like www.example.com/something.php?foo=jezra&bar=lickter. It sure would be nicer if the URL were www.example.com/something/jezra/lickter. On a Windows IIS server this can be done with either a third party rewrite plugin or with a very well crafted 404 script. However, since most webserver hosts don't allow for third party plugins and I had a devil of a time with the last few custom 404 IIS scripts that I wrote, I decided to go with the Apache webserver, which has a built-in mechanism for rewriting URLs.
There seemed to be two ways to rewrite the URLs. The first way was to break apart the URL into a list of the page and its arguments and let the page decide what to do with the request; thus www.example.com/something/jezra/lickter would get directly rewritten as www.example.com/something.php?foo=jezra&bar=lickter and something.php would handle the request. The second way, which I decided to use, was to have a controlling script that would take all of the arguments of the URL, load a specific class, and pass the arguments to the class. In this way, I could define a default class that has the capabilities to handle database connections as well as handling text output to a template system for user interaction. Thus a request for www.example.com/something/jezra/lickter would actually load index.php, and the index script would make a new instance of the "something" control class and pass the class the arguments "jezra" and "lickter". Upon creating the "something" instance, the parent class gets initialized, and it is the parent class that handles the templating system and database connections.
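The second approach can be sketched roughly as follows. This is not the actual index.php; the class names `Control` and `Something`, the `describe` method, and the `dispatch` function are all invented for illustration, and the real parent class would also set up the database and template system.

```php
<?php
// Hypothetical parent class; the real one also handles templates and the database.
class Control
{
    protected $args;

    public function __construct(array $args)
    {
        $this->args = $args;
    }

    public function describe()
    {
        return get_class($this) . ' got: ' . implode(', ', $this->args);
    }
}

// A control named after the first part of the requested URL.
class Something extends Control
{
}

// Turn "/something/jezra/lickter" into a Something instance with two arguments.
function dispatch($path)
{
    // Split the path and drop the empty pieces left by leading or doubled slashes.
    $parts = array_values(array_filter(explode('/', $path), 'strlen'));
    if (count($parts) === 0) {
        return null;
    }
    // Only letters, digits, and underscores may end up in a class name.
    $class = preg_replace('/[^A-Za-z0-9_]/', '', array_shift($parts));
    if ($class === '' || !class_exists($class) || !is_subclass_of($class, 'Control')) {
        return null;
    }
    return new $class($parts);
}

$control = dispatch('/something/jezra/lickter');
echo $control->describe(), "\n";
```

Checking that the class exists and extends the parent before instantiating it keeps a request from loading arbitrary classes; an unknown name simply returns null, which a real controller might turn into a 404 page.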
What is being used
To handle the template system, I am using TinyButStrong. Actually, the parent control class loads a wrapper class that in turn handles the TinyButStrong template. By extending the parent control class, the individual controls are able to set template variables as well as pass arrays for duplication of tabular data. Because a wrapper class handles the template system, I will be able to change template systems in the future by rewriting only the template wrapper, without having to rewrite the code for the various controls.
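To make the wrapper idea concrete, here is a toy sketch. It does not use TinyButStrong at all; the class name `TemplateWrapper`, its methods, and the `{name}` / `{#rows}...{/rows}` placeholder syntax are all assumptions made up for this example. The point is only that controls talk to `setVar`, `setRows`, and `render`, so the engine behind those three methods could be swapped out.

```php
<?php
// Hypothetical wrapper with a stand-in engine; a real one would delegate to
// whatever template library is in use.
class TemplateWrapper
{
    private $vars = array();
    private $rows = array();

    public function setVar($name, $value)
    {
        $this->vars[$name] = $value;
    }

    public function setRows($name, array $rows)
    {
        $this->rows[$name] = $rows;
    }

    public function render($template)
    {
        $rows = $this->rows;
        // Repeat each {#name}...{/name} section once per row of data.
        $out = preg_replace_callback(
            '/\{#(\w+)\}(.*?)\{\/\1\}/s',
            function ($m) use ($rows) {
                $html = '';
                $list = isset($rows[$m[1]]) ? $rows[$m[1]] : array();
                foreach ($list as $row) {
                    $html .= self::fill($m[2], $row);
                }
                return $html;
            },
            $template
        );
        // Then fill in the plain {name} variables.
        return self::fill($out, $this->vars);
    }

    // Replace {key} placeholders with HTML-escaped values.
    private static function fill($text, array $values)
    {
        foreach ($values as $key => $value) {
            $text = str_replace('{' . $key . '}', htmlspecialchars((string) $value), $text);
        }
        return $text;
    }
}

$tpl = new TemplateWrapper();
$tpl->setVar('title', 'Phones');
$tpl->setRows('people', array(
    array('name' => 'Ann', 'phone' => '555-0100'),
    array('name' => 'Bob', 'phone' => '555-0101'),
));
$html = $tpl->render('<h1>{title}</h1>{#people}<p>{name}: {phone}</p>{/people}');
echo $html, "\n";
```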
Currently I handle database abstraction through PDO (PHP Data Objects) in the parent control class, and this works fine for my needs. Moving forward, I should take greater advantage of the various PDO functions.
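As an example of the kind of PDO feature worth leaning on, here is a minimal sketch of prepared statements. It is not taken from the framework; the in-memory SQLite database, the `people` table, and the sample rows are assumptions for illustration, and it needs the pdo_sqlite extension to run.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
// Throw exceptions on errors instead of failing silently.
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE people (name TEXT, phone TEXT)');

// A prepared statement can be reused, and the bound values are never
// pasted into the SQL text, so quotes in the data are harmless.
$insert = $pdo->prepare('INSERT INTO people (name, phone) VALUES (:name, :phone)');
$insert->execute(array(':name' => 'Ann', ':phone' => '555-0100'));
$insert->execute(array(':name' => "Bob O'Neil", ':phone' => '555-0101'));

$select = $pdo->prepare('SELECT phone FROM people WHERE name = :name');
$select->execute(array(':name' => "Bob O'Neil"));
$phone = $select->fetchColumn();
echo $phone, "\n";
```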
What's Missing
Lots of things.
What bugs me
If and when I need to make asynchronous JavaScript calls, I have been using PHPLiveX to create the browser-specific JavaScript code. I would prefer for the server not to have to rewrite the URL, parse code, and process a template every time there is an AJAX call to the server.
Basic Layout
- index.php // the main controller that handles all rewritten URLs
- controls // a directory that contains all of the includable controls, all controls are named after the first part of a requested URL
- javascripts // a directory of javascript files, all files are named after the first part of a requested URL
- styles // a directory of css files, all files are named after the first part of the requested URL
- templates // a directory of templates that are filled with data and sent to the user's browser
- classes // a directory of classes for creating controllers, templates, and database connections
- favicon.ico // an icon to send to the web browser
- config.php // a php file that contains configurable information for database access and file path location
- .htaccess // an Apache file that contains rewrite rules
Why
Why not? It was a fun and useful learning experience.
2008-06-08
This is sample code and a proof of concept. If you are interested in a well crafted webserver written with PHP, I suggest Nanoweb.
Why the hell would one want to write a webserver in PHP?
Well, for the same reason one would write a proof of concept webserver in any other programming language; it needed to be done. Actually, this started out as a test of using sockets for inter-application communication on a network and was written to run using PHP4 on a Macintosh. Since I could either delete the file and be done with it, or share the file and maybe help someone, I'm opting for the share and help.
Here you go.....
#!/usr/bin/env php -q
<?php
//function get_include_contents
// input : path to a php file
// output : the parsed text of the php file
function get_include_contents($filename)
{
echo "parsing: $filename\n";
if (is_file($filename))
{
ob_start();
include $filename;
$contents = ob_get_contents();
ob_end_clean();
return $contents;
}
return false;
}
function get_file_extension($file)
{
//match a literal dot followed by the extension
if(preg_match("/\.([a-z0-9.]*)/i",$file,$matches))
{
return $matches[1];
}
//no extension found
return "";
}
function get_path_to_file($request)
{
global $pwd;
global $default_http_dir;
return $pwd."/".$default_http_dir.$request;
}
function get_referer($text)
{
//pull the path out of a referer that points at this server's port
if(preg_match("/10000(\/\S*)/i",$text,$matches))
{
return $matches[1];
}
return "";
}
//path to php
$php_path="/usr/local/php5/bin/php";
echo "starting server...\n";
//we want to run indefinitely
set_time_limit(0);
//we need to output info to the terminal as soon as we can
ob_implicit_flush();
//define a bunch of variables
//what address are we binding to?
$bind_address='127.0.0.1';
//access port
$port=10000;
//what is the default file?
$default_array= array("index.php","index.htm","index.html");
//set the default dir for files
$default_http_dir = "webroot";
//where are we?
$pwd=trim(`pwd`);
//start connecting sockets
$socket_resource=socket_create(AF_INET,SOCK_STREAM,0) or die("socket_create()failed.");
socket_bind($socket_resource,$bind_address,$port) or die("socket_bind()failed");
socket_listen($socket_resource,5) or die("socket_listen()failed");
echo "Server is up and running ....\n";
do //this is the MAIN loop
{
//socket_accept() returns false on failure
if(($accept_socket_resource=socket_accept($socket_resource))===false)
{
echo "socket_accept() failed\n";
break;
}
if(FALSE===($buffer_text=socket_read($accept_socket_resource,2048)))
{
echo "socket_read() failed\n";
break;
}
preg_match("/ .+ /i",$buffer_text,$matches);
$full_request = trim($matches[0]);
$request_and_arguments = explode("?",$full_request);
$request = urldecode( $request_and_arguments[0] );
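//start each request with a clean $_GET, and allow for requests without a query string
$_GET = array();
$arguments = isset($request_and_arguments[1]) ? $request_and_arguments[1] : "";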
// if we have some get vars, lets make a $_GET[] array set
if($arguments)
{
$args_n_vars = explode("&",$arguments);
foreach($args_n_vars as $arg_n_var)
{
list($var,$value) = explode("=",$arg_n_var);
$_GET[$var]=urldecode($value);
}
}
if(substr($request,strlen($request)-1,1)=="/")
{
//find the first default file
foreach($default_array as $default_file)
{
$tfile = stripslashes($pwd)."/".$default_http_dir."/".$default_file;
if( file_exists($tfile) )
{
$request = "/$default_file";
break;
}
}
}
$file_extension = get_file_extension($request);
$path_to_file = get_path_to_file($request);
//what mime type should be returned?
switch($file_extension)
{
case "png":
$mime="image/png";
break;
case "gif":
$mime="image/gif";
break;
case "jpg":
$mime="image/jpeg";
break;
case "zip":
$mime="application/zip";
break;
case "exe":
$mime="application/octet-stream";
break;
case "bmp":
$mime="image/bmp";
break;
case "mov":
$mime="video/quicktime";
break;
case "mp3":
$mime="audio/mpeg";
break;
case "mpeg":
$mime="video/mpeg";
break;
case "txt":
$mime="text/plain";
break;
default:
$mime="text/html";
break;
}
if(file_exists($path_to_file))
{
//we have a file to return to the user
$return_file = true;
//is this a php file?
if($file_extension == "php")
{
//we had better parse the php
//where are we now?
$start_dir= getcwd();
//cd to the php files dir
$php_file_dir = dirname( $path_to_file);
$php_file = basename( $path_to_file );
chdir($php_file_dir);
//run the included PHP file
$file_contents = get_include_contents("$php_file");
$file_length=strlen($file_contents);
//return to the start dir
chdir($start_dir);
}else{
echo "serving: $path_to_file\n";
$file_contents = file_get_contents($path_to_file);
$file_length=filesize($path_to_file);
}
if($return_file)
{
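//HTTP wants the date in GMT, e.g. "Sun, 08 Jun 2008 12:00:00 GMT"
$today=gmdate("D, d M Y H:i:s")." GMT";
$output="HTTP/1.0 200 OK\r\nServer: PHP Proof of Concept\r\nDate: $today\r\nConnection: close\r\nContent-Length: $file_length\r\nContent-Type: $mime\r\n\r\n";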
socket_send($accept_socket_resource,$output,strlen($output),0);
$start_chunk=0;
//stop once every byte has been sent
while($start_chunk<$file_length)
{
$text_chunk=substr($file_contents,$start_chunk,2048);
socket_write($accept_socket_resource,$text_chunk,strlen($text_chunk));
$start_chunk+=2048;
}
}
}else{
$output="HTTP/1.0 404 OBJECT NOT FOUND\r\nServer: PHP Server\r\nConnection: close\r\n\r\n";
socket_write($accept_socket_resource,$output,strlen($output));
}
socket_close($accept_socket_resource);
}while(true);
socket_close($socket_resource);
?>
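If I were to revisit the request parsing, it might look something like the sketch below. This is not part of the server above; the function name `parse_request` and the sample request are made up for illustration. It pulls the method and path out of the first line and lets PHP's own `parse_str` decode the query string, instead of splitting on "&" and "=" by hand.

```php
<?php
// Hypothetical helper: split the first line of an HTTP request into its parts.
function parse_request($buffer)
{
    // The request line is everything up to the first line break.
    $first_line = (string) strtok($buffer, "\r\n");
    $parts = explode(' ', $first_line);
    if (count($parts) < 2) {
        return null; // not a usable request line
    }
    // Separate the path from the optional query string.
    $pieces = explode('?', $parts[1], 2);
    $query = array();
    if (isset($pieces[1])) {
        parse_str($pieces[1], $query); // also handles URL decoding
    }
    return array(
        'method' => $parts[0],
        'path'   => urldecode($pieces[0]),
        'query'  => $query,
    );
}

$request = parse_request("GET /a%20b.php?x=1&y=two HTTP/1.0\r\nHost: localhost\r\n\r\n");
echo $request['method'], ' ', $request['path'], "\n";
```

Returning the query as its own array also avoids writing into the global `$_GET`, so one request cannot leak values into the next.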
Geez, I really don't like looking at my old code, since it always makes me think about how I would rewrite it if I were writing it today, and this code is about 3-4 years old.
Comments
He probably still does all of his administrative scripting in PHP, I bet. It didn't seem like a bad way to do business. :)