PERL (Practical Extraction and Report Language) Quick References
FIZZLE A filehandle or directory handle.
$FIZZLE A scalar variable.
@FIZZLE An array indexed by number.
%FIZZLE An array indexed by string.
&FIZZLE Subroutine
*FIZZLE Everything named FIZZLE.
Special Variables:
$a = <STDIN> sets $a to the next line of standard input.
@a = <STDIN> sets @a to all the rest of the input lines.
<ARGV> Filehandle that reads the files named on the command line.
@ARGV List of arguments (usually filenames) from the command line.
$ARGV Holds the name of the file currently being read through <ARGV>.
$#ARGV Holds the index of the last element of @ARGV (base is 0).
$_ Holds the current line; the default for input and pattern matching.
$& Holds the text of what you matched.
$` Holds everything before the match.
$' Holds everything after the match.
$@ Holds the error message from the last eval (empty if it succeeded).
$[ Holds the current array base, ordinarily 0.
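A minimal sketch of the match variables and $@ in use. The sample string and the error text are made up for illustration.

```perl
use strict;
use warnings;

my $text = "hello world";
my ($before, $matched, $after);

if ($text =~ /o w/) {
    # Copy the match variables right away; the next successful match overwrites them.
    ($before, $matched, $after) = ($`, $&, $');
}
print "before=[$before] match=[$matched] after=[$after]\n";

# $@ holds the error from the most recent eval (empty string on success).
eval { die "something failed\n" };
my $err = $@;
print "eval error: $err";
```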
Operator Precedence: Lowest to highest
Associativity Operators
nonassoc The list operators (eg. print,sort,chmod)
left ,
right =,+=,-=,*=, etc.
right ?:
nonassoc ..
left ||
left &&
left |^
left &
nonassoc ==,!=,<=>,eq,ne,cmp
nonassoc <,>,<=,>=,lt,gt,le,ge
nonassoc The named unary operators (eg. chdir)
nonassoc -r,-w,-x, etc.
left <<,>>
left +,-,.
left *,/,%,x
left =~,!~
right !,~,and unary minus
right **
nonassoc ++,--
left '('
Scalar Operators:
Pattern Matching
!~ Not Match
(e.g. $a !~ /pat/ "true if $a does not contain pattern").
=~ Match
(e.g. $a =~ /pat/ "true if $a contains pattern").
=~ Substitution
(e.g. $a =~ s/p/r/ "replace the first occurrence of p with r in $a"; add the g modifier to replace all).
=~ Translation
(e.g. $a =~ tr/a-z/A-Z/ "convert $a to upper case").
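The four binding forms above can be tried on a sample string. This is only a sketch; the string and variable names are invented for the example.

```perl
use strict;
use warnings;

my $line = "the quick brown fox";

my $has_fox = ($line =~ /fox/) ? 1 : 0;    # match
my $no_dog  = ($line !~ /dog/) ? 1 : 0;    # negated match

# Copy first, then bind the operator to the copy, so $line stays unchanged.
(my $swapped = $line) =~ s/quick/slow/;    # substitution
(my $upper   = $line) =~ tr/a-z/A-Z/;      # translation

print "$has_fox $no_dog\n";
print "$swapped\n";
print "$upper\n";
```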
Logical Operators
$a && $b And (true if $a is true and $b is true).
$a || $b Or ($a if $a is true otherwise $b).
! $a Not (true if $a is not true).
Arithmetic Operators
$a + $b Add
$a - $b Subtract
$a * $b Multiply
$a / $b Divide
$a % $b Modulus
$a ** $b Exponentiate
++$a,$a++ Autoincrement
--$a,$a-- Autodecrement
rand($a) Random
String Operations
$a . $b Concatenation
$a x $b Repeat (value of $a strung together $b times)
substr($a,$o,$l) Substring (Substring at offset $o of length $l)
index($a,$b) Index (Offset of string $b in string $a)
Assignment Operations
$a = $b Assign $a gets the value of $b
$a += $b Add to Increase $a by $b
$a -= $b Subtract from Decrease $a by $b
Test Operations
Numeric String Meaning
== eq Equal to
!= ne Not equal to
> gt Greater than
>= ge Greater than or equal to
< lt Less than
<= le Less than or equal to
<=> cmp Comparison, with signed return (-1, 0 or 1)
File Operations
-r $a File is readable by effective uid.
-R $a File is readable by real uid.
-w $a File is writable by effective uid.
-W $a File is writable by real uid.
-x $a File is executable by effective uid.
-X $a File is executable by real uid.
-o $a File is owned by effective uid.
-O $a File is owned by real uid.
-e $a File exists.
-z $a File has zero size.
-s $a File has non-zero size (returns size in bytes).
-f $a File is a plain file.
-d $a File is a directory.
-l $a File is a symbolic link.
-p $a File is a named pipe (FIFO).
-S $a File is a socket.
-b $a File is a block special file.
-c $a File is a character special file.
-u $a File has setuid bit set.
-g $a File has setgid bit set.
-k $a File has sticky bit set.
-t $a Filehandle is opened to a tty.
-T $a File is a text file.
-B $a File is a binary file.
-M $a Age of file (at startup) in days since modification.
-A $a Age of file (at startup) in days since last access.
-C $a Age of file (at startup) in days since inode change.
open(MYFILE,"> myfilename"); Create file.
open(MYFILE,">> myfilename"); Append to file.
open(MYFILE,"| output-pipe-command"); set up output filter.
open(MYFILE,"input-pipe-command|"); set up input filter.
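A short sketch of a few file tests together with open. It uses a temporary file from the core File::Temp module, plus the three-argument open with a lexical handle, which is a stylistic choice for this example and not the form used in the list above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile() returns an open handle and the file's name.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "line one\nline two\n";
close($fh) or die "close failed: $!";

my $exists   = -e $path ? 1 : 0;   # file exists
my $is_plain = -f $path ? 1 : 0;   # plain file
my $is_dir   = -d $path ? 1 : 0;   # not a directory
my $size     = -s $path;           # size in bytes

# Append one more line, then look at the size again.
open(my $out, '>>', $path) or die "can't append to $path: $!";
print {$out} "line three\n";
close($out) or die "close failed: $!";
my $new_size = -s $path;

print "exists=$exists plain=$is_plain dir=$is_dir\n";
print "size grew from $size to $new_size bytes\n";
```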
Named Unary Operations
alarm getprotobyname log sin
chdir gethostbyname ord sleep
cos getnetbyname oct sqrt
chroot gmtime require srand
exit hex reset umask
eval int rand exp
length rmdir getpgrp localtime
$a = <STDIN>; sets $a to next input line.
@a = <STDIN>; sets @a to all the rest of the input lines.
@a = (1..3); same as @a = (1, 2, 3);
@a = (); same as $#a = -1; i.e. make null list.
Associative Arrays:
%a = ('Mon', 'Monday', 'Tue', 'Tuesday'); Assignment
$b = $a{'Mon'} To access
%a = (); Make empty hash
%ENV Built in i.e. $home = $ENV{'HOME'}
%SIG Built in
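A small sketch of the hash operations listed above, using a hash named %day chosen for the example.

```perl
use strict;
use warnings;

my %day = ('Mon', 'Monday', 'Tue', 'Tuesday');
$day{'Wed'} = 'Wednesday';                  # add one more pair

my $full    = $day{'Mon'};                  # look up by key
my @keys    = sort keys %day;               # keys come back unordered, so sort them
my $has_fri = exists $day{'Fri'} ? 1 : 0;   # test for a key without creating it

print "$full\n";
print join(',', @keys), "\n";
print "has Fri: $has_fri\n";

%day = ();                                  # empty the hash
my $count = keys %day;                      # number of keys, now 0
print "count after clearing: $count\n";
```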
Intrinsic Functions:
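split(/char/,$a) ;* split string $a on char and return the pieces as a LIST
splice(@a,offset,length[,LIST]);* remove length elements at offset, optionally replacing them with LIST (like matparse)
join('char',@a) ;* concatenate a list into one string separated by char (like reuse)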
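The three functions can be chained on one string. This is a sketch with made-up data.

```perl
use strict;
use warnings;

my $csv    = "alpha,beta,gamma,delta,epsilon";
my @fields = split(/,/, $csv);                 # string -> list of 5 fields

# Remove 2 elements starting at offset 1 and put 'X' in their place.
my @removed = splice(@fields, 1, 2, 'X');

my $joined = join('-', @fields);               # list -> string

print "removed: @removed\n";
print "joined:  $joined\n";
```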
chop($number = <STDIN>); # input number and remove newline
which means the same thing as
$number = <STDIN>; # input number
chop($number); # remove newline
%longday = ("Sun", "Sunday", "Mon", "Monday", "Tue", "Tuesday",
"Wed", "Wednesday", "Thu", "Thursday", "Fri",
"Friday", "Sat", "Saturday");
Because it is sometimes difficult to read a hash that is defined
like this, Perl provides the => (equal sign, greater than) sequence
as an alternative separator to the comma. Using this syntax (and
some creative formatting), it is easier to see which strings are
the keys, and which strings are the associated values.
%longday = (
    "Sun" => "Sunday",
    "Mon" => "Monday",
    "Tue" => "Tuesday",
    "Wed" => "Wednesday",
    "Thu" => "Thursday",
    "Fri" => "Friday",
    "Sat" => "Saturday",
);
$answer = 42; # an integer
$pi = 3.14159265; # a "real" number
$avocados = 6.02e23; # scientific notation
$pet = "Camel"; # string
$sign = "I love my $pet"; # string with interpolation
$cost = 'It costs $100'; # string without interpolation
$thence = $whence; # another variable
$x = $moles * $avocados; # an expression
$cwd = `pwd`; # string output from a command
$exit = system("vi $x"); # numeric status of a command
$fido = new Camel "Fido"; # an object
@home = ("couch", "chair", "table", "stove");
$home[0] = "couch";
$home[1] = "chair";
$home[2] = "table";
$home[3] = "stove";
Logical Operators
Example Name Result
$a && $b And $a if $a is false, $b otherwise
$a || $b Or $a if $a is true, $b otherwise
! $a Not True if $a is not true
$a and $b And $a if $a is false, $b otherwise
$a or $b Or $a if $a is true, $b otherwise
not $a Not True if $a is not true
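Because && and || return one of their operands, not just true or false, they are often used to pick a default. A sketch with invented variable names:

```perl
use strict;
use warnings;

my $empty = "";
my $zero  = 0;
my $word  = "camel";

my $name  = $empty || "anonymous";   # $empty is false, so the right side is used
my $port  = $zero  || 8080;          # 0 is false too, so 8080 is used
my $both  = $word  && "llama";       # $word is true, so the right side is returned
my $short = $zero  && "llama";       # $zero is false, so $zero itself is returned

# 'or' works the same way but binds very loosely, which suits flow control.
my @log;
$zero or push @log, "zero is false";

print "$name $port $both $short\n";
print "@log\n";
```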
Numeric and String Comparison Operators
Comparison Numeric String Return Value
Equal == eq True if $a is equal to $b
Not equal != ne True if $a is not equal to $b
Less than < lt True if $a is less than $b
Greater than > gt True if $a is greater than $b
Less than or equal <= le True if $a not greater than $b
Comparison <=> cmp 0 if equal, 1 if $a greater, -1 if $b greater
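The difference between the numeric and string operators shows up most clearly when sorting. A sketch with sample numbers:

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

my @numeric = sort { $a <=> $b } @nums;   # compare as numbers
my @stringy = sort { $a cmp $b } @nums;   # compare as strings, character by character

print "numeric: @numeric\n";   # 1 9 10 100
print "string:  @stringy\n";   # 1 10 100 9

# The same value can be equal numerically and different as a string.
my $num_equal = ("10" == "10.0") ? 1 : 0;
my $str_equal = ("10" eq "10.0") ? 1 : 0;
print "== says $num_equal, eq says $str_equal\n";
```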
Example Name Result
-e $a Exists True if file named in $a exists
-r $a Readable True if file named in $a is readable
-w $a Writable True if file named in $a is writable
-d $a Directory True if file named in $a is a directory
-f $a File True if file named in $a is a regular file
-T $a Text File True if file named in $a is a text file
$a = 5; # $a is assigned 5
$b = ++$a; # $b is assigned the incremented value of $a, 6
$c = $a--; # $c is assigned 6, then $a is decremented to 5
$line .= "\n"; # Append newline to $line.
$fill x= 80; # Make string $fill into 80 repeats of itself.
$val ||= "2"; # Set $val to 2 if it isn't already set.
$a = 123;
$b = 3;
print $a * $b; # prints 369
print $a x $b; # prints 123123123
while (defined ($line = <DATAFILE>)) {
    chomp $line;
    $size = length $line;
    print "$size\n"; # output size of line
}
Because this is a common operation and that's a lot to type, Perl
gives it a shorthand notation. This shorthand reads lines into
$_ instead of $line. Many other string operations use $_ as a
default value to operate on, so this is more useful than it may
appear at first:
while (<DATAFILE>) {
    chomp;
    print length, "\n"; # output size of line
}
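@lines = <DATAFILE>; # read all remaining lines into an array
undef $/; # no record separator, so the next read returns everything
$whole_file = <FILE>; # 'slurp' mode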
% perl -ne 'BEGIN { $/="%%\n" } chomp; print if /Unix/i' fortune.dat
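The input record separator $/ controls what a "line" is. Here is a sketch that slurps a file and then reads it again in "%%\n"-separated records, as the one-liner above does. The sample data is written to a temporary file made for the example, and local keeps the change to $/ inside one block.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "first\nfortune\n%%\nsecond about Unix\n%%\n";
close($fh) or die "close failed: $!";

# Slurp mode: with $/ undefined, one read returns the whole file.
open(my $in, '<', $path) or die "can't open $path: $!";
my $whole_file = do { local $/; <$in> };
close($in);

# Record mode: each read returns everything up to the next "%%\n".
open($in, '<', $path) or die "can't open $path: $!";
my @records;
{
    local $/ = "%%\n";
    while (my $rec = <$in>) {
        chomp $rec;               # chomp removes the current $/, here "%%\n"
        push @records, $rec;
    }
}
close($in);

my @unix = grep { /Unix/i } @records;
print "records: ", scalar(@records), "\n";
print "matching: $unix[0]";
```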
The truncate function changes the length of a file, which can be
specified as a filehandle or as a filename. It returns true if the
file was successfully truncated, false otherwise:
truncate(HANDLE, $length)
or die "Couldn't truncate: $!\n";
truncate("/tmp/$$.pid", $length)
or die "Couldn't truncate: $!\n";
seek(LOGFILE, 0, 2) or die "Couldn't seek to the end: $!\n";
seek(DATAFILE, $pos, 0) or die "Couldn't seek to $pos: $!\n";
seek(OUT, -20, 1) or die "Couldn't seek back 20 bytes: $!\n";
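The third argument to seek says what the offset is relative to: 0 is the start of the file, 1 is the current position, 2 is the end. A sketch on a ten-byte temporary file, using the named constants from the core Fcntl module in place of the bare numbers:

```perl
use strict;
use warnings;
use Fcntl qw(:seek);              # SEEK_SET = 0, SEEK_CUR = 1, SEEK_END = 2
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);   # handle is open for reading and writing
print {$fh} "0123456789";

seek($fh, 0, SEEK_SET) or die "Couldn't seek to the start: $!\n";
read($fh, my $first, 3) or die "read failed: $!\n";      # bytes 0..2

seek($fh, -4, SEEK_END) or die "Couldn't seek from the end: $!\n";
my $pos = tell($fh);                                     # offset 6
read($fh, my $last, 4) or die "read failed: $!\n";       # bytes 6..9

seek($fh, -2, SEEK_CUR) or die "Couldn't seek back 2 bytes: $!\n";
read($fh, my $tail, 2) or die "read failed: $!\n";       # bytes 8..9

print "$first $pos $last $tail\n";
close($fh);
```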
The sysread and syswrite functions are different from their <FH>
and print counterparts. They both take a filehandle to act on, a
scalar variable to either read into or write out from, and the
number of bytes to read or write. They can also take an optional
fourth argument, the offset in the scalar variable to start reading
or writing at:
$written = syswrite(DATAFILE, $mystring, length($mystring));
die "syswrite failed: $!\n" unless $written == length($mystring);
$read = sysread(INFILE, $block, 256, 5);
warn "only read $read bytes, not 256" if 256 != $read;
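A sketch of the offset argument: sysread places the bytes it reads at the given offset inside the scalar and keeps what was there before. The file and its contents are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
my $data = "abcdefghij";

my $written = syswrite($fh, $data, length($data));
die "syswrite failed: $!\n"
    unless defined($written) && $written == length($data);
close($fh) or die "close failed: $!";

open(my $in, '<', $path) or die "can't open $path: $!";
my $block = "HEAD:";
# Ask for up to 256 bytes and store them starting at offset 5 of $block.
my $read = sysread($in, $block, 256, 5);
defined($read) or die "sysread failed: $!\n";
close($in);

print "read $read bytes, block is now '$block'\n";
```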
$count = `wc -l < $file`;
die "wc failed: $?" if $?;
You could also open the file and read line-by-line until the end,
counting lines as you go:
open(FILE, "< $file") or die "can't open $file: $!";
$count++ while <FILE>;
# $count now holds the number of lines read
Here's the fastest solution, assuming your line terminator
really is "\n":
$count += tr/\n/\n/ while sysread(FILE, $_, 2 ** 16);
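Both counting approaches can be checked against each other on a small file. This sketch writes three lines to a temporary file first; it says nothing about which approach is faster on real data.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "one\ntwo\nthree\n";
close($fh) or die "close failed: $!";

# Method 1: read line by line and count.
open(my $in, '<', $path) or die "can't open $path: $!";
my $by_line = 0;
$by_line++ while <$in>;
close($in);

# Method 2: read large blocks and count the newlines in each one.
open($in, '<', $path) or die "can't open $path: $!";
my $by_block = 0;
my $buf;
while (sysread($in, $buf, 2 ** 16)) {
    $by_block += ($buf =~ tr/\n//);   # tr with an empty replacement only counts
}
close($in);

print "line by line: $by_line, block read: $by_block\n";
```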
Processing Every Word in a File
while (<>) {
    for $chunk (split) {
        # do something with $chunk
    }
}
#define UT_LINESIZE 12
#define UT_NAMESIZE 8
#define UT_HOSTSIZE 16
struct utmp { /* here are the pack template codes */
short ut_type; /* s for short, must be padded */
pid_t ut_pid; /* i for integer */
char ut_line[UT_LINESIZE]; /* A12 for 12-char string */
char ut_id[2]; /* A2, but need x2 for alignment */
time_t ut_time; /* l for long */
char ut_user[UT_NAMESIZE]; /* A8 for 8-char string */
char ut_host[UT_HOSTSIZE]; /* A16 for 16-char string */
long ut_addr; /* l for long */
};
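The template codes in the struct comments can be strung together and handed to pack and unpack. This sketch only packs a made-up record and unpacks it again. The field values are hypothetical, and the real utmp layout differs between systems, so the template should be checked against the local header before reading an actual file.

```perl
use strict;
use warnings;

# One code per struct member, in order, with x2 for the two padding spots.
my $template = 's x2 i A12 A2 x2 l A8 A16 l';

# Hypothetical record: type, pid, line, id, time, user, host, addr.
my @fields = (7, 1234, 'pts/0', 't0', 1356552780, 'alice', 'example.org', 0);

my $record = pack($template, @fields);

my ($type, $pid, $line, $id, $time, $user, $host, $addr)
    = unpack($template, $record);

# 'A' pads with spaces when packing and strips them again when unpacking.
printf "%s logged in on %s from %s (pid %d)\n", $user, $line, $host, $pid;
print "record length: ", length($record), " bytes\n";
```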
$APPDFLT = "/usr/local/share/myprog";
do "$APPDFLT/";
do "$ENV{HOME}/.myprogrc";
If you want to ignore the system config file when the user has their own, test the return value of the do:
do "$ENV{HOME}/.myprogrc"
    or
do "$APPDFLT/";
($red, $green, $blue) = (0..2);
($name, $pw, $uid, $gid, $gcos, $home, $shell) = split(/:/, <PASSWD>);
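List assignment fills the variables on the left, in order, from the list on the right. A sketch using a made-up passwd-style line in place of a real filehandle:

```perl
use strict;
use warnings;

my ($red, $green, $blue) = (0..2);

# Hypothetical entry; a real one would come from reading the passwd file.
my $entry = "alice:x:1000:1000:Alice Example:/home/alice:/bin/sh\n";
chomp $entry;
my ($name, $pw, $uid, $gid, $gcos, $home, $shell) = split(/:/, $entry);

# An array on the left soaks up whatever is left over.
my ($first, @rest) = (1..4);

# Swapping two values needs no temporary variable.
($red, $blue) = ($blue, $red);

print "$name has uid $uid and shell $shell\n";
print "first=$first rest=@rest red=$red blue=$blue\n";
```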
Pattern Matching
. Matches any character except newline
[a-z0-9] Matches any single char in set
[^a-z0-9] Matches any single char not in set
\d Matches a digit, same as [0-9]
\D Matches a non-digit, same as [^0-9]
\w Matches an alphanumeric (word) char [a-zA-Z0-9_]
\W Matches a non-word char [^a-zA-Z0-9_]
\s Matches a whitespace char (space, tab, newline...)
\S Matches a non-whitespace char
\n Matches a newline
\r Matches a return
\t Matches a tab
\f Matches a formfeed
\b Matches a backspace (inside [] only)
\0 Matches a null char
\000 Matches a null char because...
\metachar Matches the char itself (\|,\.,\*...)
(abc) Remembers the match for later backreferences
\1 Matches whatever first of parens matched
\2 Matches whatever second set of parens matched
\3 and so on ...
x? Matches 0 or 1 x's where x is any of above
x* Matches 0 or more x's
x+ Matches 1 or more x's
x{m,n} Matches at least m x's but no more than n
abc Matches all of a, b and c in order
fee|fie|foe Matches one of fee, fie or foe
\b Matches a word boundary (outside of [] only)
\B Matches a non-word boundary
^ Anchors match to beginning of line or string
$ Anchors match to end of line or string
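A few of the constructs above combined in one sketch. The sample strings are invented for the example.

```perl
use strict;
use warnings;

# Anchors, \d, {n} and capturing parentheses: pull a date apart.
my $date = "2012-12-26";
my ($year, $month, $day) = $date =~ /^(\d{4})-(\d{2})-(\d{2})$/;

# Backreference: \1 matches whatever the first set of parens matched.
my $doubled = "hello" =~ /(\w)\1/ ? $1 : '';

# Alternation with word boundaries, collecting every match with /g.
my @found = "fee fie foe fum" =~ /\b(fee|fie|foe)\b/g;

# \s with + and anchors: trim leading and trailing whitespace.
my $trimmed = "   padded  ";
$trimmed =~ s/^\s+|\s+$//g;

print "$year/$month/$day\n";
print "doubled letter: $doubled\n";
print "found: @found\n";
print "trimmed: [$trimmed]\n";
```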
Wednesday, December 26, 2012
PERL Quick References
by Unknown | 
in Other
at 8:13 PM