=encoding utf8

=head1 NAME

Geo::libpostal - Perl bindings for libpostal

=head1 SYNOPSIS

  use Geo::libpostal ':all';

  # normalize an address
  my @addresses = expand_address('120 E 96th St New York');

  # parse addresses into their components
  my %address = parse_address('The Book Club 100-106 Leonard St Shoreditch London EC2A 4RH, United Kingdom');

  # %address contains:
  # (
  #   road         => 'leonard st',
  #   postcode     => 'ec2a 4rh',
  #   house        => 'the book club',
  #   house_number => '100-106',
  #   suburb       => 'shoreditch',
  #   country      => 'united kingdom',
  #   city         => 'london'
  # );

=head1 DESCRIPTION

libpostal is a C library for parsing/normalizing international street addresses.
Address strings can be normalized using C<expand_address>, which returns a list
of valid variations so you can check for duplicates in your dataset. It
supports normalization in over L<60 languages|https://github.com/openvenues/libpostal/tree/master/resources/dictionaries>.

An address string can also be parsed into its constituent parts, such as house
name, number, city and postcode, using C<parse_address>.

=head1 FUNCTIONS

=head2 expand_address

  use Geo::libpostal 'expand_address';

  my @ny_addresses = expand_address('120 E 96th St New York');

  my @fr_addresses = expand_address('Quatre vingt douze R. de l\'Église');

Takes an address string and returns a list of known variants. Useful for
normalization. Accepts many boolean options:

  expand_address('120 E 96th St New York',
      latin_ascii              => 1,
      transliterate            => 1,
      strip_accents            => 1,
      decompose                => 1,
      lowercase                => 1,
      trim_string              => 1,
      drop_parentheticals      => 1,
      replace_numeric_hyphens  => 1,
      delete_numeric_hyphens   => 1,
      split_alpha_from_numeric => 1,
      replace_word_hyphens     => 1,
      delete_word_hyphens      => 1,
      delete_final_periods     => 1,
      delete_acronym_periods   => 1,
      drop_english_possessives => 1,
      delete_apostrophes       => 1,
      expand_numex             => 1,
      roman_numerals           => 1,
  );

B<Warning>: old versions of libpostal fail if all options are set to false.


Also accepts an arrayref of language codes per ISO 639-1:

  expand_address('120 E 96th St New York', languages => [qw(en fr)]);

This is useful if you are normalizing addresses in multiple languages.

Finally, C<expand_address> accepts an option for which address C<components> to
expand. This is a 16-bit integer bitmask. These constants are exported with the
C<:all> tag:

  $ADDRESS_NONE
  $ADDRESS_ANY
  $ADDRESS_NAME
  $ADDRESS_HOUSE_NUMBER
  $ADDRESS_STREET
  $ADDRESS_UNIT
  $ADDRESS_LOCALITY
  $ADDRESS_ADMIN1
  $ADDRESS_ADMIN2
  $ADDRESS_ADMIN3
  $ADDRESS_ADMIN4
  $ADDRESS_ADMIN_OTHER
  $ADDRESS_COUNTRY
  $ADDRESS_POSTAL_CODE
  $ADDRESS_NEIGHBORHOOD
  $ADDRESS_ALL

These are the default components used by libpostal:

  use Geo::libpostal ':all';

  expand_address('120 E 96th St New York',
      components => $ADDRESS_NAME | $ADDRESS_HOUSE_NUMBER | $ADDRESS_STREET | $ADDRESS_UNIT
  );

The constant C<$ADDRESS_ALL> uses all components:

  expand_address('120 E 96th St New York',
      components => $ADDRESS_ALL
  );

C<expand_address> will C<die> on C<undef> and empty addresses, odd numbers of
options and unrecognized options. Exported on request.

=head2 parse_address

  use Geo::libpostal 'parse_address';

  my %ny_address = parse_address('120 E 96th St New York');

  my %fr_address = parse_address('Quatre vingt douze R. de l\'Église');

=cut

=pod

Will C<die> on C<undef> and empty addresses. Exported on request.

C<parse_address> may return C<undef> for invalid address strings.

=head1 WARNING

libpostal uses C<setup> and C<teardown> functions. Setup is lazily loaded.
Teardown occurs in an C<END> block automatically. C<Geo::libpostal> will C<die>
if C<expand_address> or C<parse_address> is called after teardown.

=head1 EXTERNAL DEPENDENCIES

L<libpostal|https://github.com/openvenues/libpostal> is required.

=head1 INSTALLATION

You can install this module with CPAN:

  $ cpan Geo::libpostal

Or clone it from GitHub and install it manually:

  $ git clone https://github.com/dnmfarrell/Geo-libpostal
  $ cd Geo-libpostal
  $ perl Makefile.PL
  $ make
  $ make test
  $ make install

=head1 AUTHOR

E<copy> 2016 David Farrell

=head1 LICENSE

See LICENSE

=cut