Performance of Perl substr() and index()

This section provides a tutorial example that tests the performance of two Perl built-in functions, substr() and index(), on a Windows 2000 system.

I wrote the following program, SubstringTest.pl, in 2002 on a Windows 2000 system to check the performance of the Perl built-in functions substr() and index():

#- SubstringTest.pl
#- Copyright (c) 2002 by Dr. Herong Yang, http://www.herongyang.com/
#
   ($numChar, $numTest) = @ARGV;
   $numChar = 10 unless $numChar;
   $numTest = 1 unless $numTest;
   $baseString = &setString($numChar);
   $subString = &setString($numChar);
   $startTime = time();
   $numMatch = 0;
   for ($i=0; $i<$numTest; $i++) {
      $numMatch = &test();
   }
   $endTime = time();
   $totalTime = $endTime - $startTime;
   $averageTime = $totalTime/$numTest;
   print("Number of tests = $numTest\n");
   print("Number of characters = $numChar\n");
   print("Number of matches = $numMatch\n");
   print("Total time = $totalTime seconds\n");
   print("Average time = $averageTime seconds\n");
   exit;
sub setString {
   local($size) = @_;
   local $str = "";
   local($i,$n,$c);
   for ($i=0; $i<$size; $i++) {
      $n = int(rand(96)) + 32;
      $c = chr($n);
      $str .= $c;
   }
   return $str;
}
sub test {
   local($i,$j,$l);
   local $str = "";
   local $num = 0;
   local $pos = -1;
   for ($i=0; $i<$numChar; $i++) {
      $l = $i+1;
      for ($j=0; $j<$numChar-$i; $j++) {
         $str = substr($subString,$j,$l);
         $pos = index($baseString,$str);
         $num++ if ($pos<0);
      }
   }
   return $num;
}

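The idea is to generate two random strings of the same length, extract every possible substring from one of them with substr(), and search for each substring in the other one with index(). Note that test() increments its counter when index() returns a negative position, so the value printed as "Number of matches" is actually the number of substrings that were not found.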
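
To make the nested loops easier to follow, here is a minimal sketch of the same enumeration on two short fixed strings. The helper name count_substrings() and the sample strings are my own assumptions for illustration and are not part of SubstringTest.pl. A string of length n has n*(n+1)/2 substrings, so "abc" yields 6 of them.

```perl
use strict;
use warnings;

# Hypothetical helper (not in SubstringTest.pl): take every substring of
# $sub and report how many are found in, or missing from, $base.
sub count_substrings {
    my ($base, $sub) = @_;
    my $n = length $sub;
    my ($found, $missing) = (0, 0);
    for my $len (1 .. $n) {                  # substring length
        for my $start (0 .. $n - $len) {     # substring start offset
            my $piece = substr($sub, $start, $len);
            # index() returns -1 when $piece does not occur in $base
            if (index($base, $piece) >= 0) { $found++ } else { $missing++ }
        }
    }
    return ($found, $missing);
}

my ($found, $missing) = count_substrings("xbcx", "abc");
print "found = $found, missing = $missing\n";   # prints: found = 3, missing = 3
```

Here "b", "c" and "bc" occur in "xbcx", while "a", "ab" and "abc" do not.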

Running this program with ActivePerl v5.6.1 on a Windows 2000 system gave me:

>SubstringTest.pl 50 1000
Number of tests = 1000
Number of characters = 50
Number of matches = 1257
Total time = 8 seconds
Average time = 0.008 seconds

>SubstringTest.pl 100 1000
Number of tests = 1000
Number of characters = 100
Number of matches = 4976
Total time = 31 seconds
Average time = 0.031 seconds

>SubstringTest.pl 200 1000
Number of tests = 1000
Number of characters = 200
Number of matches = 19916
Total time = 137 seconds
Average time = 0.137 seconds
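
The totals above are whole numbers because the built-in time() function has a resolution of one second. If finer measurements are wanted, one option is the Time::HiRes module, which exports a floating-point replacement for time(). The sketch below is only an illustration under that assumption: the helper name time_runs() and the sample workload are hypothetical, and Time::HiRes may need to be installed separately on older Perl versions such as 5.6.1.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # floating-point time() with sub-second resolution

# Hypothetical helper (not in SubstringTest.pl): run a code reference
# $count times and return (total seconds, average seconds per run).
sub time_runs {
    my ($code, $count) = @_;
    my $start = time();
    $code->() for 1 .. $count;
    my $total = time() - $start;
    return ($total, $total / $count);
}

# Sample workload: search a short repeated string 1000 times.
my $text = "the quick brown fox " x 10;
my ($total, $average) = time_runs(sub { index($text, "fox") }, 1000);
printf("Total time = %.6f seconds\n", $total);
printf("Average time = %.9f seconds\n", $average);
```

The same wrapper could be given a code reference that calls test() from SubstringTest.pl instead of the sample workload.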
