Customize Your Website for Different Locales
Convert IP Addresses to Country Names for Web Page Localization
Introduction
Localization is one of the watchwords of the modern internet. The big sites like Google seem to do this almost effortlessly. They know where you live... Or, at least, they know which country your ISP is from, and they tailor their services and the advertising they provide accordingly. Have you ever wondered how they do this? This tutorial will show you how it's done and will even enable you to provide the same kind of services for your own website! Have you thought of what you could do if you could target your content with some accuracy to various people around the world? No longer would you be tied to a single market; you could finally leverage the true power of the internet to sell globally!
Like many web developers I came into this area because I wanted to set up stores that linked into both eBay and Amazon but that appealed to more than one market. The major driver was my eBay Misspelling Search Tool, which allows users from most English-speaking territories to use the misspelling generation tool to search their local eBay stores for products that have been listed under incorrect spellings. This necessitates allowing for a number of countries (US, UK, Canada, Australia, India, Ireland). Originally I wrote this defaulting to the USA, with the users choosing their country. But this was clunky and I wanted the interface to be as 'neat' as possible. As a result I developed this country assignment algorithm to fill in the correct country so that users automatically saw the results for their particular territories. For my Amazon Search and Price Comparison page I created an equivalent system using PERL to direct visitors to the appropriate localized version of Amazon.
As an example, look at the table below. It will show you your IP address as well as the country that IP maps to and the flag corresponding to the country. This was all done automatically as soon as you navigated to this page:
| Your IP Information |
|---|
| Your IP address: 38.103.63.62 |
| Your numerical IP address: 644300606 |
| Your country name: UNITED STATES |
| Your country code: US |
As demonstrated in the table above I can actually glean a fair amount of information about your IP address quite easily. It's then possible to manipulate this information to do useful or interesting things (such as displaying the flag in the example above). Now that you've seen an example, I'll take you through precisely how the code was written so that you can, if you wish, do the same kind of thing for yourself.
Putting it all together:
Designing your own country identification code
| Page Map | |
|---|---|
| Copyright and Header | Querying the Database |
| Overview | Conclusion |
| Obtaining the Data | Join the Mailing List |
| Parsing the File | |
Copyright and Header
COPYRIGHT NOTICE:
Copyright 2005 – 2006 Dyfed Lloyd Evans, all rights reserved.
The scripts detailed on this page may be used and modified free of charge by anyone as long as this copyright notice and the comments above remain intact. By using this code you agree to indemnify Dyfed Lloyd Evans from any liability that might arise from their use.
This code is released under the GPL license. Selling the code for this program without the prior written consent of the author is expressly forbidden. Obtain permission before redistributing this software over the Internet or in any other medium. In all cases copyright and header must remain intact.
If you find this code useful and employ it or modify it for your own site, please include a link to this page http://www.celtnet.org.uk/info/IP-to-country-converter.php on your links page.
Overview:
The process of converting IP addresses to a country name is fairly simple. It's a question of obtaining a list of IP address ranges for each country, loading these into a database (MySQL) and using this to look up the IP address of anyone who accesses your web pages. You should then know what country your visitors come from and you can do useful things with that information. The one-sentence summary is fairly simple, but it's a little bit more complicated to actually turn this into something useful. In the explanations below I shall go into the entire process in some detail.
Obtaining the Data:
The allocation of IP addresses is handled by five so-called Regional Internet Registries (RIRs), which are RIPE NCC (Europe), ARIN (North America and part of the Caribbean), LACNIC (Latin America and Caribbean), APNIC (Asia Pacific) and AfriNIC (Africa). Each of these publishes a daily list of all the registry data. Thankfully all the files generated are available via anonymous FTP, and the files produced by all the registries are in the same format so they can be added together and analyzed with the same code. To give you an example of what you're dealing with, here is a snippet of the European data file:
ripencc|GR|ipv4|62.1.0.0|65536|20000216|allocated
ripencc|CH|ipv4|62.2.0.0|65536|19981211|allocated
ripencc|SA|ipv4|62.3.0.0|8192|20000721|allocated
ripencc|SA|ipv4|62.3.32.0|8192|20020109|allocated
ripencc|GB|ipv4|62.3.64.0|16384|20010629|allocated
ripencc|SE|ipv4|62.3.128.0|8192|20001005|allocated
ripencc|PL|ipv4|62.3.160.0|8192|20020114|allocated
ripencc|GB|ipv4|62.3.192.0|16384|20030212|allocated
ripencc|FR|ipv4|62.4.0.0|8192|19970513|allocated
Personally I use a perl wrapper around the FTP application ncftp to fetch the data (you can find the FTP sites on this page [look down the left hand side for the list of sites]). What I do is feed the FTP addresses into an array then iterate through each one to fetch the data files (each data source has a link to the latest data file named delegated-*-latest). Once downloaded, each file is parsed to get the data I need and this parsed data is written to a single file for later uploading to the database. I then check the number of lines in the output file to make sure there are more than 65 000 (currently there are over 75 000). This ensures that there was no problem with the downloaded files or the FTP process. If all is well I remove the previous day's files (which have been zipped) and I zip up the current day's files. The data centres update their files on a daily basis and it's worthwhile updating at least twice a week, as the larger files can fluctuate by several hundred entries.
As you can see from the example above, the data format is relatively simple and easy to parse as the data is delimited by the pipe "|" character.
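The two sanity checks described above (are all five files there, and is the merged output long enough?) can be sketched as follows. This is a Python illustration rather than the perl wrapper I actually use. The registry names used to fill in the delegated-*-latest pattern and both function names are assumptions made for the example; the 65 000 line threshold is simply the figure quoted above.

```python
import os
import tempfile

# Assumed registry names used to fill in the delegated-*-latest pattern.
REGISTRIES = ["ripencc", "arin", "lacnic", "apnic", "afrinic"]
MIN_LINES = 65000  # threshold quoted in the text


def missing_files(directory):
    """Return the expected data files that are absent from directory."""
    expected = ["delegated-%s-latest" % name for name in REGISTRIES]
    present = set(os.listdir(directory))
    return [name for name in expected if name not in present]


def output_looks_complete(path, minimum=MIN_LINES):
    """True if the merged output file has more than `minimum` lines."""
    with open(path) as handle:
        return sum(1 for _ in handle) > minimum


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Pretend that only two of the five downloads succeeded.
        for name in ("ripencc", "arin"):
            open(os.path.join(workdir, "delegated-%s-latest" % name), "w").close()
        print(missing_files(workdir))
```

Run as it stands, the demo lists the three files that were never "downloaded". In a real job you would stop there and re-run the FTP step rather than load a short file into the database.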
COLUMN VALUES
---------------------------------------------------------------------
REGISTRY: apnic,arin,ripencc,lacnic,iana
COUNTRY_CODE: One of 240 unique 2-character country codes or "*" or "ZZ" (unassigned)
ADDRESS_TYPE: asn,ipv4,ipv6
ADDRESS: Either the starting IP Address or AS Number or "*"
NUMBER: Number of IPs in range or "1" if ADDRESS_TYPE is "asn"
DATE: Date IP range or AS Number was added to database or "*"
RANGE_TYPE: "allocated" -> borrowed; "assigned" -> owned
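To make the column layout above concrete, here is a minimal Python sketch that splits one line into named fields. The field names are just lower-case versions of the column names listed above, and parse_line is a name invented for this example. Any line that doesn't split into exactly seven fields is skipped.

```python
from collections import namedtuple

# One field per column in the table above.
Record = namedtuple(
    "Record",
    "registry country_code address_type address number date range_type",
)


def parse_line(line):
    """Split one pipe-delimited line into a Record, or return None."""
    fields = line.strip().split("|")
    if len(fields) != 7:
        return None  # not a seven-column data line
    return Record(*fields)


sample = "ripencc|GR|ipv4|62.1.0.0|65536|20000216|allocated"
record = parse_line(sample)
print(record.country_code, record.address, record.number)  # GR 62.1.0.0 65536
```

Note that every field comes back as a string, so the NUMBER column needs converting before you do any arithmetic with it.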
As it happens we're only interested in the ipv4 subset of the data so it's possible to discard a significant portion of the data. First, however, you'll need two database tables: a table for the ipv4 data and a table that maps the two-character country code to a country name.
The basic ipv4 database:
CREATE TABLE ip_maps (
code char(2) default NULL,
registry char(10) default NULL,
ip_from double default NULL,
ip_to double default NULL,
UNIQUE KEY registry (registry,ip_from,ip_to)
);
The country code database:
CREATE TABLE country_codes (
code char(2) default NULL,
country varchar(50) default NULL,
UNIQUE KEY code (code)
);
For my own database I have an additional column width int that describes the width of the flags I display on this page (even if flag sizes are altered to be the same heights the widths vary so this is necessary).
Parsing the File:
The PERL script given below performs all the processing for the data. The method I use to fetch the data via FTP has been excluded from the top of the file, as I'm fairly certain that you can roll your own.
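#!/usr/bin/perl
###################################################
# common start variables
###################################################
use strict;
use warnings;
use DBI;
use utf8;
###########################################
# database connection variables
###########################################
my $srcdb  = 'ipv4_db';
my $dbuser = 'foo';
my $dbpass = 'bar';
my $hst    = 'my.host.string';
###########################################
# The following variable allows the country code to name
# database to be dropped and re-constructed if, for some reason
# the country code dataset has changed
###########################################
my $drop = $ARGV[0];
####################################################
# At this point I perform the FTP downloads
# add your own code to do this here
####################################################
####################################################
# This perl script assumes that all the data files
# and downloaded files are in the same directory as this
# script
####################################################
my @filelist;
opendir(DIR,".") || die "Cannot read the current directory\n";
foreach my $file (readdir(DIR))
{
    if ($file =~ /latest$/)
    {
        push (@filelist,$file);
    }
}
closedir(DIR);
#########################################
# Check that all the files are available
#########################################
if (scalar(@filelist) < 5)
{
    print "One of the input files is missing... Please run the FTP downloads again\n";
    exit;
}
##########################################################
# Create the output file
##########################################################
open(OUTFILE,">ipv4_data.csv") || die "Cannot open output file\n";
foreach my $fil (@filelist)
{
    open(INFILE,$fil) || die "Cannot open $fil\n";
    print "Processing $fil\n";
    while (my $line = <INFILE>)
    {
        # The listing is cut off at the start of this loop. The body below
        # is a reconstruction from the file format and table layout
        # described on this page; the step that loads the output file
        # into the database is not shown.
        chomp($line);
        my ($registry,$code,$type,$start,$count) = split(/\|/,$line);
        # keep only ipv4 records that carry a real country code
        next unless (defined($count) && $type eq 'ipv4');
        next unless ($code =~ /^[A-Z]{2}$/ && $code ne 'ZZ');
        next unless ($start =~ /^\d+\.\d+\.\d+\.\d+$/);
        # convert the dotted start address to a decimal number
        my $ip_from = 0;
        foreach my $octet (split(/\./,$start))
        {
            $ip_from = ($ip_from * 256) + $octet;
        }
        # the block ends at start + count - 1
        my $ip_to = $ip_from + $count - 1;
        print OUTFILE join(",",$code,$registry,$ip_from,$ip_to),"\n";
    }
    close(INFILE);
}
close(OUTFILE);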
This script relies on your having created a tab-separated country name to country ID mapping file named ctry_list.lis of the format:
AFGHANISTAN AF
ALBANIA AL
ALGERIA DZ 36
AMERICAN SAMOA AS
ANDORRA AD
ANGOLA AO
ANGUILLA AI
The data presented above is based on the recognized international standard for country codes, ISO 3166, which you should download and parse to create the lookup file. The codes correspond to those used in the registry files. The one to watch is the United Kingdom: its ISO 3166 code is GB, and that is what the IPv4 data uses, even though UK is the more familiar abbreviation (and the country's internet domain). If your parsed list has the United Kingdom under UK, the easiest way to overcome this is to add an extra entry: UNITED KINGDOM GB to the parsed file.
You'll note that the IP addresses in the registry download files are of the form 202.127.4.0 (i.e. four numbers separated by periods '.'). IPv4 uses 32-bit (4-byte) addresses, which limits the address space to 4,294,967,296 possible unique addresses. However, as many of these are reserved the actual number of available addresses is much lower. The example given above is in dot-decimal notation, which comprises four octets in decimal separated by periods. Everyone is familiar with this notation; however, it's not very useful if you need to find a number within an IP range (the registries provide us with IP ranges rather than a list of numbers). To convert from decimal-coded octets to a single decimal number the following process is used (e.g. for the IP 1.2.3.4): 1.2.3.4 = 4 + (3 * 256) + (2 * 256 * 256) + (1 * 256 * 256 * 256) = 16909060. The numbers are split and reversed, then each is multiplied by 256^n with n in the range 0 to 3, and the results are added together. This results in a pure decimal number.
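Reading the mapping file is simple enough that a short sketch covers it. The Python below is only an illustration (load_country_names is a name made up for the example). It takes the first two tab-separated columns of each line, ignores anything after them, and makes sure the extra UNITED KINGDOM GB entry mentioned above is present.

```python
def load_country_names(lines):
    """Map two-letter codes to country names from tab-separated lines."""
    names = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name = fields[0].strip()
        code = fields[1].strip().upper()
        names[code] = name
    # Make sure the code used in the registry files resolves to a name.
    names.setdefault("GB", "UNITED KINGDOM")
    return names


sample = ["AFGHANISTAN\tAF\n", "ALBANIA\tAL\n", "ANDORRA\tAD\n"]
names = load_country_names(sample)
print(names["AL"], "/", names["GB"])  # ALBANIA / UNITED KINGDOM
```

In practice you would pass it an open file handle for ctry_list.lis instead of the sample list, since iterating over a file yields lines in just the same way.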
If you're using PERL then the following code will perform the various conversions for you:
IP Address -> IP Number Conversion
my (@octets,$octet,$ip_number,$number_convert,$ip_address);
$ip_address = $ARGV[0];
chomp ($ip_address);
@octets = split(/\./, $ip_address);
$ip_number = 0;
foreach $octet (@octets) {
$ip_number <<= 8;
$ip_number |= $octet;
}
print "The IP Address: $ip_address converts to the following IP Number: $ip_number\n";
IP Number -> IP Address Conversion
my (@octets,$i,$ip_number,$ip_number_display,$number_convert,$ip_address);
$ip_number_display = $ip_number = $ARGV[0];
chomp($ip_number);
for($i = 3; $i >= 0; $i--) {
$octets[$i] = ($ip_number & 0xFF);
$ip_number >>= 8;
}
$ip_address = join('.', @octets);
print "The IP Number $ip_number_display converts to the IP Address $ip_address\n";
If you're using PHP then it's even easier:
IP Address -> IP Number Conversion
$ip_number = sprintf("%u",ip2long($ip_address));
echo "The IP Address: ".$ip_address." converts to the following IP Number: ".$ip_number."<br/>";
IP Number -> IP Address Conversion
$ip_address = long2ip($ip_number);
echo "The IP Number ".$ip_number." converts to the IP Address ".$ip_address."<br/>";
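If you want to check the arithmetic for yourself without a web server to hand, the same two conversions can be written out longhand. The Python sketch below is just an illustration of the multiply-by-256 process described earlier (the function names are my own for this example); it reproduces both the 1.2.3.4 example and the numerical IP address shown in the table at the top of the page.

```python
def ip_to_number(address):
    """Convert dotted notation such as '1.2.3.4' to a single number."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(o < 0 or o > 255 for o in octets):
        raise ValueError("not a dotted IPv4 address: %r" % address)
    number = 0
    for octet in octets:
        # Shift what we have so far up by one octet, then add the next one.
        number = number * 256 + octet
    return number


def number_to_ip(number):
    """Convert a number back to dotted notation."""
    octets = []
    for _ in range(4):
        octets.append(str(number % 256))  # peel off the lowest octet
        number //= 256
    return ".".join(reversed(octets))


print(ip_to_number("1.2.3.4"))       # 16909060
print(ip_to_number("38.103.63.62"))  # 644300606
print(number_to_ip(644300606))       # 38.103.63.62
```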
Above you've just found out how to derive the decimal representation for an IP address. The IPv4 data, however, comes as blocks of IP addresses, with the first address of each block given in dotted notation and the size of the block given as a decimal count of addresses (65536 or 8192, for instance, in the sample data above). Thus the end of a block is simply given by adding the block length to the block start and subtracting 1 (the subtraction is needed because the block start itself counts as the first address in the block). With the block ranges converted to decimal it's now fairly easy to query the database and obtain some useful information.
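As a worked example of the start plus length minus 1 rule, here is a small Python sketch applied to two lines from the European sample data. It uses Python's standard ipaddress module for the conversion, and block_range is simply a name chosen for this illustration.

```python
import ipaddress


def block_range(start_address, count):
    """Return (first, last) decimal numbers for a block of addresses."""
    first = int(ipaddress.IPv4Address(start_address))
    # The start address is itself the first address, hence the minus 1.
    last = first + int(count) - 1
    return first, last


first, last = block_range("62.1.0.0", 65536)
print(first, last)                  # 1040252928 1040318463
print(ipaddress.IPv4Address(last))  # 62.1.255.255

# The block after 62.3.32.0 in the sample starts at 62.3.64.0, so this
# one should end exactly one address before that.
first, last = block_range("62.3.32.0", 8192)
print(ipaddress.IPv4Address(last))  # 62.3.63.255
```

Forgetting the minus 1 would make each block overlap the first address of the next one, which is exactly the kind of thing that gives you two rows back from a lookup instead of one.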
Querying the Database:
If the code ran properly with no errors, then you will now have two populated database tables and you can start to do fun things like running queries across them. For example:
select cc.country, cc.code, ip.registry, ip.ip_from, ip.ip_to from country_codes cc, ip_maps ip
where cc.code = ip.code and ip.code = "GL";
This gives us the following result:
+-----------+------+----------+------------+------------+
| country | code | registry | ip_from | ip_to |
+-----------+------+----------+------------+------------+
| GREENLAND | GL | ripencc | 1481834496 | 1481842687 |
| GREENLAND | GL | ripencc | 3266437120 | 3266445311 |
+-----------+------+----------+------------+------------+
2 rows in set (0.05 sec)
That's fine for a basic query, but what about doing something useful? I'll now give you some real code that takes an incoming IP address and works out the country of origin. This is PHP code and it employs some initial comparisons to check the validity of the incoming IP address:
#first the settings to link to the database
$user="foo";
$pwd="bar";
$host="my.mysql.host";
$database="ipv4_db";
#now connect to the database
#(the old mysql_* functions were removed in PHP 7, so mysqli is used here)
$db = mysqli_connect($host,$user,$pwd,$database) or die("Unable to connect to database");
#fetch the incoming IP address and perform some checks to make sure
#that this is valid
#(the first two values come from request headers, so a client can set them)
if (getenv("HTTP_CLIENT_IP")) $add = getenv("HTTP_CLIENT_IP");
else if(getenv("HTTP_X_FORWARDED_FOR")) $add = getenv("HTTP_X_FORWARDED_FOR");
else if(getenv("REMOTE_ADDR")) $add = getenv("REMOTE_ADDR");
else $add = "UNKNOWN";
#now check that the IP address is valid
#ip2long() returns false for anything that is not a dotted IPv4 address
$long = ip2long($add);
#if the IP address is valid
if ($long !== false)
{
    #convert the octets into useful decimal
    $ip = sprintf("%u", $long);
    #now create the query. Here we're checking to find which IP range the incoming IP
    #address belongs to
    $query = "SELECT cc.country as ctry, cc.code as tlc, ip.registry as rgst from country_codes cc,
    ip_maps ip where cc.code = ip.code and ".$ip." >= ip_from and ".$ip." <= ip_to";
    $result = mysqli_query($db,$query) or die('Error, query failed');
    while($row = mysqli_fetch_assoc($result))
    {
        echo "Your IP address: ".htmlspecialchars($add)." originates in ".$row['ctry'].", code: ".$row['tlc']." from registry ".$row['rgst']."<br/>";
    }
}
else
{
    #something's wrong with the IP address it's either not valid or is being spoofed
    echo "Your stated IP address is invalid... Are you trying to spoof me?";
}
The code above fetches the IP address of a visitor (after doing some checking to make sure that it's valid) and then grabs the country corresponding to that IP address. If the address is invalid it even warns the visitor that you suspect them of trying to spoof you.
If, however, you're more into PERL than PHP then the code below will perform exactly the same function, but in PERL style:
#!/usr/bin/perl
use DBI;
use utf8; #you probably don't use this but I employ a lot of UTF-8 encoding in my databases
#first the settings to link to the database
my $srcdb = 'ipv4_db';
my $dbuser = 'foo';
my $dbpass = 'bar';
my $hst = 'my.mysql.host';
my $prt = 3306; #the default MySQL port
my $add;
##################################################
# now initiate the database connection
##################################################
my $dsn = "DBI:mysql:$srcdb:$hst:$prt";
my $dbh = DBI->connect( $dsn, $dbuser, $dbpass) or die "Can't connect to $dsn: $DBI::errstr\n";
##################################################
## Now check for the IP address
##################################################
if ($ENV{'HTTP_CLIENT_IP'}){ $add = $ENV{'HTTP_CLIENT_IP'};}
elsif($ENV{'HTTP_X_FORWARDED_FOR'}){ $add = $ENV{'HTTP_X_FORWARDED_FOR'};}
elsif($ENV{'REMOTE_ADDR'}){ $add = $ENV{'REMOTE_ADDR'};}
else{ $add = "UNKNOWN";}
###################################################
## Now it's possible to do the conversion
## (only if the address looks like four dotted numbers)
###################################################
if ($add =~ /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/)
{
    my (@octets,$octet,$ip_number,$ip_address);
    $ip_address = $add;
    chomp ($ip_address);
    @octets = split(/\./, $ip_address);
    $ip_number = 0;
    foreach $octet (@octets) {
        $ip_number <<= 8;
        $ip_number |= $octet;
    }
    print "The IP Address: $ip_address converts to the following IP Number: $ip_number\n";
    my $sql = "SELECT cc.country as ctry, cc.code as tlc, ip.registry as rgst from country_codes cc,
    ip_maps ip where cc.code = ip.code and ".$ip_number." >= ip_from and ".$ip_number." <= ip_to";
    my $sth = $dbh->prepare($sql);
    $sth->execute;
    my ($ctry,$tlc,$registry) = $sth->fetchrow_array();
    if (defined($ctry))
    {
        print "Your IP address: $add originates in $ctry, code $tlc\n";
    }
    else
    {
        print "Your IP address: $add was not found in the database\n";
    }
}
else
{
    print "The IP address you are navigating from is not recognized. Are you trying to spoof me?\n";
}
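If you'd like to try the lookup logic without setting up MySQL first, the sketch below does the same join against a throwaway in-memory SQLite database. It is an illustration only: it's written in Python, the column types are simplified, the function names are invented for the example, and the only data loaded are the two Greenland ranges from the query result shown earlier. The address is validated before it gets anywhere near the query, and it's passed as a bound parameter rather than being pasted into the SQL string.

```python
import ipaddress
import sqlite3


def build_demo_db():
    """Create an in-memory database holding the two Greenland ranges."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country_codes (code TEXT, country TEXT)")
    conn.execute(
        "CREATE TABLE ip_maps "
        "(code TEXT, registry TEXT, ip_from INTEGER, ip_to INTEGER)"
    )
    conn.execute("INSERT INTO country_codes VALUES ('GL', 'GREENLAND')")
    conn.executemany(
        "INSERT INTO ip_maps VALUES (?, ?, ?, ?)",
        [
            ("GL", "ripencc", 1481834496, 1481842687),
            ("GL", "ripencc", 3266437120, 3266445311),
        ],
    )
    return conn


def country_for(conn, address):
    """Return (country, code, registry) for an address, or None."""
    try:
        number = int(ipaddress.IPv4Address(address))
    except ValueError:
        return None  # not a valid dotted IPv4 address
    return conn.execute(
        "SELECT cc.country, cc.code, ip.registry "
        "FROM country_codes cc JOIN ip_maps ip ON cc.code = ip.code "
        "WHERE ? BETWEEN ip.ip_from AND ip.ip_to",
        (number,),
    ).fetchone()


conn = build_demo_db()
print(country_for(conn, "88.83.0.1"))   # ('GREENLAND', 'GL', 'ripencc')
print(country_for(conn, "88.83.32.0"))  # None
print(country_for(conn, "UNKNOWN"))     # None
```

The second address is the first one past the end of the 88.83.0.0 block, so it falls outside both ranges and nothing comes back, which is the case where a real page should fall back to a default country.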
Conclusion:
The information above will allow you to easily replicate the IP to country name lookup used at the very top of this page. It should take you about an hour to put all the code together. Just think, some companies out there (who will remain nameless) are charging up to $99 a month for an IP lookup database and some supporting code. Why pay that when I've shown you precisely how to do it for yourself?
If you enjoyed this page and would like to get more tips, tricks and offers to help you make the most of your web presence, please sign up for my weekly e-mail newsletter. Please note that your details will never be sold or shared with others. You are signing up for my e-mail only.


