Monday, December 12, 2005

Encryption and security: an overview

This is an article I wrote a while ago for my Slashdot journal. I've cross-posted it here and updated it for your perusal.

Q:

I would really have loved a debate / educational description from the guru's / opportunity to learn something myself about silc over ssh vs. the "secure " setting in yahoo vs. outlook vs. pine security maybe some opinions about the "I don't get viruses because I'm opensource" vs. uSoft vs. apple, etc.

A:

OK. Here goes. You asked for it.

I'm unfamiliar with "silc", so I won't talk about it. (Gee, that was fast.)

Encryption

Encryption comes in two forms: symmetric and asymmetric.

Symmetric encryption is anything where the same key is used to encrypt and to decrypt. As a really trivial example, consider an encryption scheme where you add one to every byte in the file. In this "add" algorithm, 1 is the key. It is symmetric, because you can subtract that same key from an encrypted message to get the original back.

Of course, a simple "add" encryption scheme is pretty useless and is easily broken. But many more complicated schemes (like DES and AES) are also symmetric. Although the decryption algorithm may differ from the encryption algorithm, both use the same key.
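As a minimal sketch, the "add" scheme might look like this in Python. The function names are made up for illustration, and wrapping around at 256 is an assumption about how to handle a byte that is already at its maximum value.

```python
def add_encrypt(data: bytes, key: int) -> bytes:
    """Add the key to every byte, wrapping around at 256."""
    return bytes((b + key) % 256 for b in data)


def add_decrypt(data: bytes, key: int) -> bytes:
    """Subtract the same key from every byte to get the original back."""
    return bytes((b - key) % 256 for b in data)


message = b"Hello"
secret = add_encrypt(message, 1)
print(secret)                  # b'Ifmmp'
print(add_decrypt(secret, 1))  # b'Hello'
```

The same key (1) is used in both directions, which is what makes the scheme symmetric.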

Symmetric encryption can be very secure, but it has one big loophole. Both the sender and the receiver need to have the same key, which means they have to trust each other. If you send me a file encrypted with a symmetric algorithm, you have to give me your key as well. If a third party gets your key, he can intercept your document in transit, decrypt it, modify it, re-encrypt it, and send on the result. I have no way of knowing that the file was tampered with.

Asymmetric encryption solves this problem. With asymmetric encryption, the key you need to decrypt a message is different from the one you used to encrypt it. To be properly secure, it should not be possible to derive one key from the other (or at least not without a LOT of work.)

Most modern encryption systems use (at least partly) a thing called public key encryption. This is a variation on asymmetric encryption where the two keys in a pair are interchangeable (RSA, the best-known example, works this way). That is, if I have a pair of keys (A and B), I can encrypt with either one, and the result can be decrypted by the other. When you generate a pair of keys, you declare one to be your public key and the other to be your private key. You keep the private key to yourself and never give it out to anybody. You give the public key to anybody who wants it (posting it on a web page is not unusual.)
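To see the "either key undoes the other" property in action, here is a toy sketch using textbook RSA with the classic small example numbers (primes 61 and 53). The helper name apply_key is made up for illustration. Real systems use keys hundreds of digits long plus padding, so treat this strictly as a demonstration of the idea.

```python
# Toy textbook-RSA key pair. The numbers are far too small for real use.
N = 61 * 53     # the shared modulus, 3233
KEY_A = 17      # one half of the pair
KEY_B = 2753    # the other half: (17 * 2753) % 3120 == 1, where 3120 = 60 * 52


def apply_key(value: int, key: int, modulus: int = N) -> int:
    """Encrypt or decrypt a number; the operation is the same either way."""
    return pow(value, key, modulus)


message = 65

# Lock with A, unlock with B.
print(apply_key(apply_key(message, KEY_A), KEY_B))  # 65

# Lock with B, unlock with A.
print(apply_key(apply_key(message, KEY_B), KEY_A))  # 65
```

Which half you publish is a matter of declaration: whichever one you keep secret is the private key.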

Now, suppose you send out a document encrypted with your private key. Anybody in the world can read it (because you've made your public key available for the taking), but because they have to use your public key to decrypt it, they know that you were the originator of the message. If someone else tried to forge the message in your name, your public key wouldn't work on it. If a third party decrypts your document and alters it, he can't re-encrypt it, because he doesn't have your private key.

Similarly, suppose I send out a document and I encrypt it with your public key. Only your private key will be able to decrypt it. So I know that only you can read the message. But anybody could have sent it to you (since everybody can get your public key.)

Either of these scenarios is useful. But ideally, you want the advantages of both. You want to make sure that only I can read your message, and you want it to be impossible for anyone to forge your identity. The solution is simple. You encrypt the message twice. Once with your private key, and again with my public key. In order to read the message, I need to decrypt it twice - once with my private key and then again with your public key. I know the message came from you (because your public key worked) and you know nobody else read it (because my private key is needed to decrypt the message.)
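The double-encryption idea can be sketched with two toy RSA key pairs. Everything here (the primes, the helper names make_keypair and apply_key) is an assumption chosen for illustration. The receiver's modulus is deliberately larger than the sender's so that the first result always fits inside the second operation. Encrypting with a private key, as in the first step, is what is usually called a digital signature.

```python
def make_keypair(p, q, e):
    """Build a toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)      # modular inverse; needs Python 3.8 or later
    return (e, n), (d, n)    # (public key, private key)


def apply_key(value, key):
    """Raise the value to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


your_public, your_private = make_keypair(61, 53, 17)  # the sender
my_public, my_private = make_keypair(89, 97, 5)       # the receiver

message = 65
signed = apply_key(message, your_private)   # shows it came from you
sealed = apply_key(signed, my_public)       # only I can open it

opened = apply_key(sealed, my_private)      # first decryption: my private key
recovered = apply_key(opened, your_public)  # second decryption: your public key
print(recovered)  # 65
```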

Public key encryption is computationally expensive, so it generally isn't used for encrypting actual documents. Instead, it will be used as a part of a key-exchange algorithm. A symmetric key will be randomly generated, used for one session only, and will then be discarded. Public key encryption is used for one side of the connection to give the key to the other side, so that third parties can't intercept it.
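A rough sketch of that arrangement follows, reusing the trivial "add" cipher as the symmetric algorithm and toy textbook RSA for the key exchange. The names send, receive and add_cipher are hypothetical, and the tiny numbers are assumptions for demonstration only. Real protocols use ciphers like AES and much longer keys.

```python
import secrets

# The receiver's toy RSA key pair (classic small example values).
N, PUBLIC_E, PRIVATE_D = 3233, 17, 2753


def add_cipher(data: bytes, key: int) -> bytes:
    """Stand-in symmetric cipher: add the key to every byte (mod 256)."""
    return bytes((b + key) % 256 for b in data)


def send(document: bytes):
    """Pick a one-time session key, wrap it with the public key, encrypt the body."""
    session_key = secrets.randbelow(255) + 1       # random value from 1 to 255
    wrapped_key = pow(session_key, PUBLIC_E, N)    # only the private key can unwrap this
    return wrapped_key, add_cipher(document, session_key)


def receive(wrapped_key: int, body: bytes) -> bytes:
    """Unwrap the session key with the private key, then decrypt the body."""
    session_key = pow(wrapped_key, PRIVATE_D, N)
    return add_cipher(body, -session_key)


wrapped, body = send(b"quarterly report")
print(receive(wrapped, body))  # b'quarterly report'
```

The expensive public key operation touches only one small number, while the cheap symmetric cipher handles the whole document.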

SSH

SSH is primarily meant to be used as a replacement for telnet (and related utilities like rlogin, rsh and rexec). The reason for this is that telnet (like rlogin, rsh and rexec) has minimal security. These protocols do not encrypt their connections in any way.

This means that anybody sharing a network with you, the remote host, or any network in between, can see everything you're doing. It's trivially easy to get a PC to intercept all the packets that flow through an Ethernet network, and it's often not too difficult for an administrator to do this on other kinds of networks.

Someone snooping on a telnet/rlogin/rsh/rexec session can easily mirror your session. He can see everything you type (including passwords) and everything you see/download.

This doesn't often happen, because most internet service providers are trustworthy. But there are still cases where people on corporate and campus LANs have stolen sensitive data this way.

This also happens over wireless networks, like those found at trade shows and internet cafes. WEP (Wired Equivalent Privacy) provides some degree of security, but WEP can be cracked fairly easily these days, and it won't stop someone who knows the encryption key (like anyone that has paid for time on the wireless LAN.) Wi-Fi Protected Access (WPA) is more secure, but if history is any guide, it will eventually be cracked (and replaced with something even more secure.)

ssh solves this problem by encrypting everything, so someone snooping your packets will be unable to view the content. (More accurately, the amount of computer power needed to break the encryption will be greater than what most people will be willing to expend.) It also provides some optional features that allow access only to people with pre-assigned security credentials.

Although I regularly use telnet/rlogin/rsh/rexec for local traffic (between two computers at home or two computers at work) I try to avoid using them over the internet.

Also part of the ssh distribution is "sftp", a secure replacement for FTP that runs over the encrypted ssh connection. FTP suffers from the same problem as telnet - passwords are sent without encryption, so a third party could intercept them.

Secure web pages

You can encrypt web connections, as long as the server supports it. Encrypted URLs usually begin with "https:". (The "s" stands for "secure".) The HTTPS protocol can use a wide variety of different encryption standards, some more secure than others.

Most web browsers will let you know when the page is secured, typically with a padlock icon or other appropriate icon in some corner of the browser window. You can usually click on this to view the page's security information (including the kind of encryption, the certificate providing the encryption keys, and the identity of the authority that generated the certificate.)

Any kind of encryption will keep third parties from snooping your packets. Encryption protocols with more bits in the key will be harder for third parties to crack, although anything can be cracked by someone determined enough.

You still have to decide whether the server on the other end can be trusted, of course. This is where the certificate and the certificate authority come in. A site's certificate identifies the owner of the site (usually including name and address contact information.) The certificate's data includes some encryption/authentication data to keep a third party from tampering with it. Part of that information is a signature from a certificate authority. This means that the authority is vouching for the content of the certificate.

The idea here is that a rogue web site may try to impersonate a real one. For instance, an identity theft ring may create a server that looks like Citibank's server. Citibank will, of course, have a certificate that identifies them, but the rogue site will probably also have a certificate. So that alone isn't enough to make things secure.

The certificate authority takes care of that. Citibank's real certificate will be signed by an authority (Citibank uses VeriSign, if you're curious). When you choose to view the certificate, you will see that it is signed by VeriSign. Your web browser can use VeriSign's well-known public key (built in to most browsers, and available for download for others) to validate that the certificate is, in fact, genuine.

If the rogue site tries to use the same certificate, it won't match the server and your web browser should alert you. If they try to alter the certificate, it will no longer validate against VeriSign's public key - only VeriSign can issue certificates that their public key can decode.
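The same two checks (is the certificate signed by a trusted authority, and does it match the server) are what a program does when it opens an HTTPS connection. As a sketch, Python's standard ssl module turns both on by default. The helper peer_subject is a hypothetical name, and it is only defined here, not called, since calling it needs a live network connection.

```python
import socket
import ssl

# The default context loads the system's trusted certificate authorities.
context = ssl.create_default_context()

# Both protections described above are switched on by default.
print(context.verify_mode == ssl.CERT_REQUIRED)  # True: the certificate must validate
print(context.check_hostname)                    # True: it must match the server name


def peer_subject(hostname: str, port: int = 443):
    """Connect to a server and return the owner details from its certificate.

    Raises ssl.SSLCertVerificationError if the certificate is not signed by
    a trusted authority or does not match the hostname.
    """
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["subject"]
```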

Of course, you still have to decide whether to trust the remote server or not. No protocol will help you here. But HTTPS will let you be certain that the remote server really is who you think it is, and that no third party will intercept your session.

Microsoft Outlook

When people talk about Outlook's security, they are usually talking about something else. The problem there is how Outlook handles attachments.

People attach all kinds of files to all kinds of mail messages all the time. If I e-mail you a picture of somebody, it gets sent as an attachment. If I e-mail you a Word file I want you to read, it gets sent as an attachment. And if I e-mail you a program, it gets sent as an attachment. If I send you an HTML-formatted mail message with an image or background music or something, attachments will be used there as well.

All mail programs (well, all except the oldest and simplest ones) can deal with attachments. Either they make you save the attachment, or they may launch an external program for viewing it, or they may launch a plugin to view it, or they may be able to view it directly.

Outlook's problem comes from the way it launches attachments. There have been many bugs (most have been fixed by now, I believe) that will cause Outlook to automatically execute a program or a script that is sent in an e-mail attachment (usually with the message malformed in such a way as to trigger the bug.)

Once this happens, the program is running on your computer like any other program. As such, it can do anything. Worms (which are effectively viruses that spread to other computers on their own) often exploit this. They will e-mail themselves to others in a way that takes advantage of these kinds of bugs, so that the receiver will end up auto-launching the program, causing the worm to spread further.

The reason so many people hate Outlook is that these kinds of auto-launch bugs are extremely rare (and sometimes unheard of) in other mail programs, but Outlook has had tons of them over the years.

But the worms have been getting trickier. As Microsoft has fixed the various bugs that can cause executable attachments to auto-launch, the people developing the worms have gotten sneakier. Usually, they take advantage of human gullibility.

For instance, the worm may include (in the mail message's text) a message telling you that the attachment is a critical system patch from Microsoft, or an updater to some popular program, or a program needed to prevent your bank account from closing, or other similar gimmick. The user who trusts this message and runs the program gets infected with the virus/worm, and it spreads further on.

Antivirus programs routinely check e-mail for viruses and worms these days, so this works less often than it used to. So the viruses now often pack themselves into zip files for e-mailing. The idea is that virus scanners may avoid scanning a zip file. The message will direct the user to unpack the zip and run the contents - at which point the virus gets launched.

But virus scanners now scan the contents of zip files.

Which is where viruses like "Beagle/Bagle" come in. Zip files support built-in encryption. This way, only an authorized user can see the contents of the file. Obviously, if the zip file can't be opened, then a virus scanner can't scan the contents.

So the Beagle/Bagle virus sticks itself into an encrypted zip file. In the e-mail it sends itself through, it tells the recipient what the decryption key is. If someone is gullible enough to expand the zip, enter the decryption key, and run the contents, the virus will run.

The scary thing is that these viruses do spread. There are thousands of people who have been tricked into decrypting and executing the virus.

Macro viruses

In addition to these kinds of viruses, some popular tools that have macro languages (like the parts of Microsoft Office) can also be vectors for spreading viruses. A macro in a Word/Excel/PowerPoint document can spread to other documents and can even e-mail itself elsewhere.

Fortunately, today's virus scanners are smart enough to scan the contents of office documents. It's also easy to disable all macro capability in MS Office, which is usually a good idea, since very few people actually use them.

E-mail security

To be on the safe side, many people simply refuse to look at any attachments whatsoever. And with web-mail services and AOL (and possibly a few others) this is easy - you don't even have to download the attachment from your server if you want to delete it without opening it.

But IMO, this is overkill. And it's not an option for many people.

For instance, I often send and receive pictures with my friends and relatives. I also send and receive Microsoft Office documents all the time as a part of my job.

Fortunately, it is easy to spot executable attachments with most mail programs. Look for a MIME type like application/octet-stream (a generic binary file, usually used for program files), or application/vbs (Visual Basic Script), etc. And look for file extensions like .exe, .vbs, .com, .bat, .pif, etc. If you have Windows configured to hide extensions (the factory default setting), change that configuration so you can see them.
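If you'd rather have a script do the looking, here is a minimal sketch using Python's standard email package. The function name risky_attachments and the sample message are made up for illustration, and the lists simply repeat the MIME types and extensions mentioned above. They are not a complete catalogue of dangerous file types.

```python
from email.message import EmailMessage

RISKY_EXTENSIONS = (".exe", ".vbs", ".com", ".bat", ".pif")
RISKY_TYPES = ("application/octet-stream", "application/vbs")


def risky_attachments(message: EmailMessage) -> list:
    """Return the names of attachments that look like programs or scripts."""
    flagged = []
    for part in message.iter_attachments():
        name = (part.get_filename() or "").lower()
        if name.endswith(RISKY_EXTENSIONS) or part.get_content_type() in RISKY_TYPES:
            flagged.append(name)
    return flagged


# Build a sample message with one program and one picture attached.
mail = EmailMessage()
mail["Subject"] = "Critical system patch"
mail.set_content("Please run the attached update.")
mail.add_attachment(b"MZ", maintype="application", subtype="octet-stream",
                    filename="patch.exe")
mail.add_attachment(b"\x89PNG", maintype="image", subtype="png",
                    filename="photo.png")

print(risky_attachments(mail))  # ['patch.exe']
```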

As a second line of defense, get a good virus scanner (most people I know use either NAI/McAfee or Symantec/Norton) and keep it up to date. Updates come out frequently - sometimes more than once per day. Antivirus packages all include an auto-update facility where they will periodically download updates from the publisher's web site. Home editions of these programs require you to buy annual subscriptions to keep getting updates, while corporate editions typically do not.

Non-virus E-mail security

In addition to viruses, there are a few other potential security risks to e-mail that some programs make you vulnerable to.

For instance, HTML e-mail. HTML is useful and cool. It lets you send mail with nice formatting, colors, fonts, images, etc. You can also include links to remote sites, and have remote references to objects on web servers (like images, sounds, etc.)

The problem is that this can create a security problem. For instance, suppose I send you an e-mail with a reference to an image on my web server. You open the mail, and your mail program fetches the image from my server. You see the image and all's well.

Now suppose I'm a spammer and do the same thing. You think "no big deal" and just delete the spam. But I own the web server. Suppose I send out a million spams, and that image-reference has a slightly different name in each one. I keep a list of which e-mail addresses got which image-references. I can look at my web server's log file and find out which of those image-references were used to download images, match them against my list, and bingo! I now know which people actually read the spam (as opposed to those that deleted it without opening, or those that never received it.) Since I know somebody's reading spam at that address, I'm going to start sending him lots more.
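The bookkeeping behind this trick is tiny, which is why it is so common. The sketch below shows the matching step. The function names, the addresses and the way the log is represented (a plain list of requested paths) are all assumptions for illustration.

```python
import uuid


def build_campaign(addresses):
    """Give every address its own image name and remember who got which."""
    return {f"{uuid.uuid4().hex}.gif": address for address in addresses}


def readers_from_log(image_owner, requested_paths):
    """Match image names found in the web server log back to addresses."""
    readers = set()
    for path in requested_paths:
        name = path.lstrip("/")
        if name in image_owner:
            readers.add(image_owner[name])
    return readers


campaign = build_campaign(["alice@example.com", "bob@example.com"])

# Pretend only Alice's mail program fetched its image.
alice_image = next(n for n, a in campaign.items() if a == "alice@example.com")
log = ["/" + alice_image, "/favicon.ico"]

print(readers_from_log(campaign, log))  # {'alice@example.com'}
```

Nothing in the e-mail itself has to run. Simply loading the image is enough to give the address away.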

Some mail programs and web sites (like Thunderbird and Yahoo) provide options to block all remote-image references that appear in mail. This keeps the spammers from knowing that you've read their spam.

Also, HTML e-mail can contain Java applets and JavaScript. These are normally not dangerous, but there have been bugs that allow them to be used for spreading viruses. They can also sometimes open network connections to a remote web server, where they can alert a spammer that his mail has been read.
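As a rough idea of what such a blocking option has to do, this sketch uses Python's standard HTML parser to pick out image references that point at a remote server. The class name RemoteImageFinder and the sample HTML are made up, and this is not how any particular mail program implements the feature.

```python
from html.parser import HTMLParser


class RemoteImageFinder(HTMLParser):
    """Collect the src of every <img> tag that points at a web server."""

    def __init__(self):
        super().__init__()
        self.remote_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        if src.lower().startswith(("http://", "https://")):
            self.remote_sources.append(src)


finder = RemoteImageFinder()
finder.feed('<p>Hi</p><img src="http://spam.example/a1b2.gif"><img src="cid:logo">')
print(finder.remote_sources)  # ['http://spam.example/a1b2.gif']
```

The second image uses a "cid:" reference, meaning it is attached to the message itself, so displaying it tells the sender nothing.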

Fortunately, good mail clients let you disable Java and JavaScript in e-mail as well.

Finally, e-mail may contain data for popular plugins, like Shockwave Flash files. Most plugins are reasonably safe, but some are not. And some of the safe ones can open up network connections, which can alert a spammer that the spam has been read. Again, good mail clients let you disable plugins from e-mail.

Of course, you do not want to have your mail client set to automatically generate return receipts, even if you receive mail that asks for them. Again, this will alert spammers that someone's reading the spam. If you want your e-mail client to generate receipts, configure it to ask for confirmation first, so you won't send it to spammers.

And NEVER send mail to a spammer's "unsubscribe" address. If you do, you'll give him concrete proof that someone at your address is reading the spam. You'll end up getting more spam, not less.

Mail client recommendations

WRT pine, I can't help here. I haven't used pine in a very long time, and I didn't bother to learn much about it back then.

FWIW, I run Thunderbird, from the Mozilla group. It has many good anti-spam features (disable Java, disable JavaScript, convert HTML to plain-text, don't load remote images, and self-learning spam filters) that I find very useful.

Microsoft has put some of this into the newest versions of Outlook, but existing copies leave a lot to be desired in this department.

People who just don't get viruses

In the "I don't get viruses" department, it's true. Some people don't get them. But if anyone says that they can't get them, they're just lying.

A savvy Windows user can almost always avoid viruses. You can use a mail client with good security in its design. You can install a virus scanner set to scan all files and have it auto-update on a daily basis. You can turn off unused network services. You can use a personal firewall package (like ZoneAlarm) on the PC and you can use a firewall on your network.

But even with all this, it is possible to get a virus. Everybody's human and it is always possible that someone may trick you into running an infected program.

There have even been cases where a software publisher has been infected, and the virus spread through CDs bought in stores. I, personally, have been infected by a virus that came in through Microsoft's "Windows Update" server. But if you stay alert and make sure your virus scanner is always running, you can reduce your risk to a minimum and minimize the damage when something hits.

People using other operating systems can afford to be a bit more cavalier. There is a certain amount of "security through obscurity". People don't bother making many viruses for Linux, Mac OS, OS/2, BeOS and other low-popularity operating systems. Not because it's impossible, but because such a virus won't spread very far. If I infect every Mac that exists, I only get 3-5% of the computers in the world. If I infect 10% of the Windows machines, that's 9.5% of the world. If I infect every OS/2 system that exists, I probably get only a fraction of a percent of the total amount of computers.
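The arithmetic is easy to check. The sketch below assumes Windows runs on about 95% of the world's computers (the share implied by the 9.5% figure) and takes the top of the 3-5% range for the Mac.

```python
windows_share = 0.95   # assumed share of the world's computers running Windows
mac_share = 0.05       # upper end of the 3-5% range

every_mac = 1.00 * mac_share             # infect every Mac that exists
tenth_of_windows = 0.10 * windows_share  # infect only 10% of Windows machines

print(f"Every Mac:      {every_mac:.1%}")         # Every Mac:      5.0%
print(f"10% of Windows: {tenth_of_windows:.1%}")  # 10% of Windows: 9.5%
```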

In other words, writing viruses for Windows gives the greatest bang for your buck.

10-15 years ago, when most computers on the internet were university mainframes, you found a lot more UNIX-based worms. But now the number of UNIX boxes on the internet is dwarfed by the number of Windows PCs attached to DSL lines and cable modems. And these PCs are every bit as connected as those university mainframes used to be.

It is definitely possible to write a virus that targets Mac OS, Linux, or anything else. And even though these viruses may require administrator access to do real damage, I'm sure that lots of users could be conned into typing in their administrative passwords, just like Beagle/Bagle got lots of people to manually decrypt a zip file in order to run the virus. And I'm sure lots of Mac/Linux home users run logged in as root or administrator accounts (just like most Windows NT/2000/XP users do their work from administrator accounts.)

If Linux or Mac OS or anything else should someday become really popular, and not just niche products, I have no doubt that we'll start seeing lots of viruses that target these platforms. These operating systems may have security features that make it harder for viruses to auto-install, but as long as we have gullible users that do whatever random e-mails tell them to do, viruses will remain a fact of life.

And, of course, this is ignoring viruses that are scripts carried in documents (like MS Office documents). These can execute on Macs, just like on Windows, since the Mac version of Office uses a compatible scripting language.
