I'm running out of time for a post to
Planet Perl Iron Man,
so I'm going to prepare something quick, but hopefully enlightening. I'll use
Perl as a demonstration language, but the practices I'm going to cover
tend to be more universal.
Since the early days of computing, operating systems have represented code
and its data as sequences of fixed-size groups of bits called
bytes. Since Unix,
which was designed to run on machines such as the 16-bit PDP-11 and later on
the 32-bit VAX family of computers, this unit has generally been 8 bits.
Today, there are many 8-bit based text encodings (and many more binary
encodings for binary data), and the interested reader is referred to
Joel Spolsky's
introduction on the subject
and
Juerd Waalboer's
perlunitut or your language's equivalent document.
In any case, let's suppose we have a string where we want to embed a
variable containing a string. In Perl we can do:
my $total = "Hello " . $adjective . " World!";
Or more simply:
my $total = "Hello $adjective World!";
So if we put "beautiful" in $adjective, we'll get
"Hello beautiful World!" in $total and if we put
"cruel" there we'll get "Hello cruel World!" there.
So far so good, as long as the result is just plain text. However, what
if it's embedded in a format with its own syntax? Let's say HTML:
# Untested
my $input = get_input_from_user_somehow();
print <<"EOF";
<p>
$input
</p>
EOF
The alert reader will notice that $input was inserted as is into the HTML
output. Since we neither checked it for special characters nor
escaped them, a malicious user can insert arbitrary HTML
code and even JavaScript code there. This in turn can wreak havoc upon the
users of the page.
This form of HTML injection is called a
cross-site-scripting attack (XSS). If present in web applications or
web-sites, it may allow malicious crackers to set up traps for the unwary,
and possibly gain access to sensitive information on the site, such as the
passwords of users or administrators. And you did notice how easy it was
to write code that exhibited this problem, right?
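As a minimal sketch of what escaping could look like here, the snippet below runs the input through a hand-rolled helper before it reaches the HTML. The name escape_html and the hard-coded sample input are assumptions made for illustration; in real code you would more likely reach for a well-tested module or your templating system's escaping filter.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any library): replace the characters that
# are special in HTML text and attribute values with entities.
sub escape_html {
    my ($text) = @_;
    my %entity_for = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    $text =~ s/([&<>"'])/$entity_for{$1}/g;
    return $text;
}

# Stand-in for untrusted user input.
my $input = q{<script>alert("gotcha")</script>};
my $safe  = escape_html($input);

# The markup now arrives in the browser as inert text.
print <<"EOF";
<p>
$safe
</p>
EOF
```

Because "&" is replaced in the same single pass as the other characters, the entities produced by the helper are not escaped a second time.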
Here are some other forms of code or markup injection:
-
Shell Command Injection - I've discussed it briefly
in
a different post about "shell variable injection" in Bash, but it also
exists in Perl. Imagine doing system("ls $dir"); or,
as some newcomers are tempted to do, `ls $dir` (although backticks
still have some legitimate uses). Now I, as a malicious user, can put
malicious shell code in the $dir variable, which will wreak havoc on
the system of the user who is running the script.
-
SQL injection
allows a user to inject malevolent SQL code that can do untold damage
in the database. It is very common in web applications and many
other applications that use SQL code. If you do something like
"SELECT id FROM users WHERE name='$name'" then by putting
single-quotes in the name, and using SQL syntax one can insert arbitrary SQL
there and do a lot of damage. There was also
a very nice xkcd comic about it.
-
Perl Code injection - let's suppose we want to construct an optimised
anonymous function ( sub { ... } ) on the fly. We can build
its code and then use
the
string eval - eval "". A lot of Perl programmers think it should be
avoided at all costs, but
metaprogramming
has some legitimate uses. Moreover, this can happen in other cases,
like when we construct a Perl program (or a program in any other language)
on the fly and execute it.
In any case, if we insert a variable into the eval "" which was input
from the user without being escaped or validated, the result can be
arbitrary code execution.
-
Regular expression injection - imagine you want to see if a
string is contained in a list of strings. One naïve way would be to concatenate
the strings using a separator that is unlikely to be contained in them
(such as \0) and then match this gigantic string using
$haystack =~ m{$needle}. However, if $needle contains special regex characters,
then the match can take a long time, fail to compile or, worse, yield an
incorrect result. One way to avoid that is to use
perldoc -f quotemeta or its \Q and \E regular expression escapes -
$haystack =~ m{\Q$needle\E}. In this particular case, it is also
probably better to use a hash, but naturally this was just one example
where we'd like to embed some arbitrary (but plain) text inside a regular
expression.
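To make the regular expression case concrete, here is a small sketch that contrasts the unescaped match with the \Q...\E version, and adds the hash-based membership test mentioned above. The helper names contains_literal and is_member and the sample strings are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $needle occurs literally inside $haystack.
sub contains_literal {
    my ($haystack, $needle) = @_;
    return $haystack =~ m{\Q$needle\E} ? 1 : 0;
}

# Hypothetical helper: exact membership test using a hash, no regex at all.
sub is_member {
    my ($wanted, @list) = @_;
    my %seen = map { $_ => 1 } @list;
    return exists $seen{$wanted} ? 1 : 0;
}

my $needle = 'foo.bar';

# Unescaped, the "." matches any character, so this matches by accident.
my $naive_hit = ('fooXbar' =~ m{$needle}) ? 1 : 0;

print "naive match:   $naive_hit\n";
print "escaped match: ", contains_literal('fooXbar', $needle), "\n";
print "member:        ", is_member($needle, 'fooXbar', 'foo.bar'), "\n";
```

A needle such as "a(" is an even clearer case: interpolated as is, it is not a valid pattern at all, while the escaped version simply looks for those two characters.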
These are the prominent examples I can think of now, but they are not
the only ones. Your program is in danger whenever it accepts text input from
the user and passes it directly to an output format that has some grammar and
syntax that can be influenced by this string.
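The following sketch illustrates that last point with the SQL example from above. It only builds and prints the query strings and never talks to a database; the function name build_query_unsafely and the sample inputs are assumptions made for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper that builds a query the unsafe way: by interpolating
# the input straight into the SQL text.
sub build_query_unsafely {
    my ($name) = @_;
    return "SELECT id FROM users WHERE name='$name'";
}

# A well-behaved name leaves the structure of the query alone...
print build_query_unsafely("alice"), "\n";

# ...but a name containing single quotes changes the grammar of the
# statement: the WHERE clause gains an extra condition that is always true.
print build_query_unsafely("x' OR '1'='1"), "\n";
```

The second query is still well-formed SQL, which is exactly the problem: the database cannot tell which parts came from the programmer and which from the user.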
So how can we mitigate such code injection problems? There are many ways -
sometimes providing alternatives and sometimes complementing each other:
-
Make sure you have enough discipline to escape the input before it
is passed to the output venue. Write automated tests for that.
-
If you still want to allow some user input through unescaped, then
analyse it to make sure it doesn't contain any malicious code
that can abuse the system. For example, you may wish to restrict input
only to certain HTML tags and attributes.
-
Taint your data using unsafe typing or "kinding" and make sure that
it can only be output after either being escaped or being untainted. Joel
on Software recommends
making the
wrong code look wrong, which, while desirable and important, is
probably less preferable than making the wrong code behave wrong
and abort with a huge "You suck!" error or something. This may not always be
possible given the limitations of the programming language, but it
is a better ideal.
-
Use auto-escaping features of your environment such as SQL place-holders
(e.g., "SELECT * FROM mytable WHERE id = ?"), and the list form of
"perldoc -f system" (e.g., system { $cmd[0] } @cmd).
-
Perform frequent code reviews, black box tests, and encourage hackers to
find problems in your code.
-
Use complementary security measures that make sure that even if a problem
occurs, its damage is mitigated. As examples, you can try running the
script under an unprivileged operating system user, or as a database
user that lacks certain database privileges.
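As an illustration of the list form of system mentioned in the list above, here is a rough sketch for the "ls $dir" case. The wrapper name run_without_shell and the sample value of $dir are assumptions for the example, and it presumes an ls program is available on the PATH.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the list form of system. The block names the
# program to run and the list is handed to it as-is, so no shell ever gets
# a chance to parse the arguments.
sub run_without_shell {
    my (@cmd) = @_;
    return system { $cmd[0] } @cmd;
}

# Stand-in for untrusted input. In system("ls $dir") the part after the
# semicolon would be executed as a second shell command.
my $dir = '/nonexistent; echo INJECTED';

# Here ls just receives one odd-looking directory name. The "--" keeps
# input that starts with a dash from being read as an option.
my $status = run_without_shell('ls', '--', $dir);
print "ls did not succeed, and nothing else was executed\n" if $status != 0;
```

The same idea applies to capturing output: the list form of a piped open can stand in for backticks without involving a shell.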
There are probably several measures that I'm forgetting, so feel free to
add them as comments or trackbacks. In any case, be careful when writing
code that may cause code or markup injection because the consequences may be
dire.