Introduction & Encryption

Adapted, with permission, from the Data Security and Ethics lecture materials by Martin Lester (University of Reading).

Cryptography

Encryption algorithms transform data to protect confidentiality.

Data that has not been encrypted is called plaintext.

Encrypting the plaintext produces ciphertext.

Decrypting the ciphertext recovers the original plaintext.

Cryptography is the study of these operations.

Almost all online services rely on cryptography for security.

Encryption keys

Most encryption algorithms use a key to encrypt and decrypt data.

Often, the key is just a sequence of random bits.

In an algorithm with symmetric keys, the same key is used to encrypt and decrypt.

In an algorithm with asymmetric keys, different keys are used for encryption and decryption.

(More on this later.)

Security through Obscurity vs Open Design

With a good encryption algorithm, unless you know the key, it is difficult or impossible to decrypt the ciphertext, even if you know the algorithm.

If you keep your encryption algorithm secret, it may be even harder...

...but you can't keep it secret if you distribute it as software...

...or hardware!

One Time Pad (OTP)

One of the simplest encryption methods is the One Time Pad.

Recall the bitwise XOR operation:

  • 0 XOR 0 = 0
  • 0 XOR 1 = 1
  • 1 XOR 0 = 1
  • 1 XOR 1 = 0

Observe:

  • x XOR y XOR y = x

Suppose y is 1 with probability 0.5...
what is the probability that (x XOR y) is 1?
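
The question above can be settled by enumerating the two equally likely values of y. The sketch below is only an illustration (the function name prob_xor_is_one is made up for this example); it suggests the answer is 0.5 whatever x is, so seeing x XOR y tells you nothing about x.

```python
from fractions import Fraction


def prob_xor_is_one(x: int) -> Fraction:
    """Probability that (x XOR y) == 1 when y is 0 or 1 with equal chance."""
    outcomes = [x ^ y for y in (0, 1)]  # one outcome per equally likely y
    return Fraction(sum(outcomes), len(outcomes))


if __name__ == "__main__":
    for x in (0, 1):
        print(f"x = {x}: P(x XOR y = 1) = {prob_xor_is_one(x)}")
```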

So:

  • Let: c = p XOR k
  • Then: c XOR k = (p XOR k) XOR k = p

Encrypt a bitstring of any length by XORing with a random key of the same length.

Decrypt by XORing with the same key.

Perfect secrecy, but only if the pad (key) is truly random, is kept secret and is used only once.
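
A minimal sketch of the scheme above, working on bytes rather than single bits. The function names are illustrative, and the pad here comes from the operating system's random generator via Python's secrets module, which is assumed to be a good enough stand-in for a truly random pad in a demonstration.

```python
import secrets


def otp_encrypt(plaintext: bytes):
    """Return (pad, ciphertext); the pad is as long as the plaintext."""
    pad = secrets.token_bytes(len(plaintext))  # fresh pad for every message
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))
    return pad, ciphertext


def otp_decrypt(ciphertext: bytes, pad: bytes) -> bytes:
    """XOR with the same pad again to recover the plaintext."""
    if len(pad) != len(ciphertext):
        raise ValueError("pad must be exactly as long as the ciphertext")
    return bytes(c ^ k for c, k in zip(ciphertext, pad))


if __name__ == "__main__":
    pad, ct = otp_encrypt(b"attack at dawn")
    print(otp_decrypt(ct, pad))
```

This is for illustration only: the pad must never be reused, and it has to reach the recipient over some secure channel.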

Stream Cipher

Problems with a One Time Pad:

  • Can only use "key" once.
  • Difficult to generate long, truly random bitstring.

Solution:

  • Use a shorter key to seed a pseudo-random number generator.
  • Use stream of pseudo-random bits (keystream) instead of key.

Still secure enough in practice, provided it is hard to work out key from keystream.
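
The idea can be sketched with a toy keystream generator. This is not a real cipher and must not be used to protect data: it simply hashes the key together with a counter using SHA-256, a construction chosen here only for illustration, to stretch a short key into as many pseudo-random bytes as needed.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy generator: SHA-256(key || counter) blocks, truncated to length."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_with_keystream(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XORing twice with the same keystream undoes it."""
    ks = keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


if __name__ == "__main__":
    key = b"short secret key"
    ct = xor_with_keystream(key, b"a much longer message than the key itself")
    print(xor_with_keystream(key, ct))
```

Note that the same key always gives the same keystream here, so reusing it for two messages brings back the "only once" problem. Real stream ciphers also mix in a per-message value to avoid this.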

ChaCha stream cipher

  • Developed in 2008 by Daniel J. Bernstein
  • Modification of his earlier 2005 cipher Salsa20
  • 256-bit keys
  • Uses sequence of add-rotate-XOR (ARX) operations to generate pseudorandom keystream
  • Keystream is XORed with plaintext to generate ciphertext
  • Similar idea to one-time pad, except keystream is not truly random
  • Ciphertext appears to be random bitstream
  • Only stream cipher permitted in TLS 1.3
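
To give a feel for what add-rotate-XOR means, here is a sketch of the ChaCha quarter round as described in RFC 8439, operating on four 32-bit words. It is only the inner building block, not the full cipher, and is shown for understanding rather than for use.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a: int, b: int, c: int, d: int):
    """One ChaCha quarter round: four add / XOR / rotate steps."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


if __name__ == "__main__":
    words = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
    print([hex(w) for w in words])
```

The full cipher applies this step repeatedly to a 16-word state built from the key, a counter and a nonce.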

Asymmetric Key Encryption

One big problem with symmetric key encryption:

To send a message, both sender and recipient need the same key.

How can the sender transmit the key securely to the recipient?

Solution: Use asymmetric key encryption algorithms, where the encryption and decryption keys are different.

Asymmetric or public key encryption needs 3 operations:

  • KeyGen() — generate pair (sk, pk)
    • sk — (private/secret) decryption key
    • pk — (public) encryption key
  • Enc(pk, m) — encrypt m with pk
    • Generate ciphertext c
  • Dec(sk, c) — decrypt c with sk
    • Return original message m

Knowing just pk, it is difficult to find sk.

Recipient can safely send pk over network.
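
The three operations can be illustrated with textbook RSA, an algorithm these notes do not name and which is picked here purely as an example. The primes are tiny and fixed, there is no padding, and messages are small integers, so this is hopelessly insecure and only shows the shape of KeyGen, Enc and Dec.

```python
def keygen():
    """Return (sk, pk) for toy RSA with fixed, tiny primes."""
    p, q = 61, 53                # real keys use large random primes
    n = p * q                    # modulus, part of both keys
    phi = (p - 1) * (q - 1)
    e = 17                       # public exponent
    d = pow(e, -1, phi)          # private exponent (Python 3.8+)
    return (n, d), (n, e)


def enc(pk, m: int) -> int:
    """Encrypt integer m (0 <= m < n) with the public key."""
    n, e = pk
    return pow(m, e, n)


def dec(sk, c: int) -> int:
    """Decrypt ciphertext c with the private key."""
    n, d = sk
    return pow(c, d, n)


if __name__ == "__main__":
    sk, pk = keygen()
    c = enc(pk, 65)
    print(c, dec(sk, c))
```

Anyone may learn pk and encrypt, but only the holder of sk can decrypt. With numbers this small, sk is of course easy to recover from pk by factoring n.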

Beyond encryption

Encryption protects the secrecy (confidentiality) of data, but there are other applications of cryptography.

Digital signatures can provide:

  • authentication — receiver sure who sent message;
  • integrity — receiver sure message was not tampered with;
  • non-repudiation — sender cannot deny sending message.

Digital Signatures

3 operations:

  • KeyGen() — generate pair (sk, vk)
    • sk — (private) signing key
    • vk — (public) verification key
  • Sign(sk, m) — sign m with sk
    • Generate signature σ
  • Verify(vk, m, σ) — verify signature
    • Was σ output of Sign(sk, m)?

In contrast with asymmetric key encryption, where the recipient holds the private key, here the private key is used by the sender.
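
As a rough sketch of the three operations, the example below signs a SHA-256 digest with textbook RSA. Both the algorithm and the fixed primes are assumptions made for illustration; there is no padding and the key is far too small, so treat it as a picture of the interface only.

```python
import hashlib


def keygen():
    """Return (sk, vk) for toy RSA signatures with fixed primes."""
    p, q = 2**31 - 1, 2**61 - 1  # two known primes, far too small for real use
    n = p * q
    phi = (p - 1) * (q - 1)
    e = 65537                    # public verification exponent
    d = pow(e, -1, phi)          # private signing exponent (Python 3.8+)
    return (n, d), (n, e)


def _digest(m: bytes, n: int) -> int:
    """Hash the message and reduce it to an integer below n."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n


def sign(sk, m: bytes) -> int:
    """Produce signature sigma for message m using the signing key."""
    n, d = sk
    return pow(_digest(m, n), d, n)


def verify(vk, m: bytes, sigma: int) -> bool:
    """Check whether sigma is a valid signature on m."""
    n, e = vk
    return pow(sigma, e, n) == _digest(m, n)


if __name__ == "__main__":
    sk, vk = keygen()
    sigma = sign(sk, b"pay Bob 10 pounds")
    print(verify(vk, b"pay Bob 10 pounds", sigma))
    print(verify(vk, b"pay Bob 1000 pounds", sigma))
```

Verification needs only vk, so anyone can check the signature, while producing one requires sk.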

Using cryptography

How to use cryptography in your application?

DO NOT TRY TO MAKE UP YOUR OWN ENCRYPTION ALGORITHM.

DO NOT TRY TO WRITE AN IMPLEMENTATION OF AN EXISTING ALGORITHM.

DO NOT COPY SOURCE CODE OF AN IMPLEMENTATION OFF A PROGRAMMING FORUM.

DO NOT HARDCODE THE KEY IN YOUR PROGRAM.
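
On the last point, one common alternative to hardcoding is to read the key from the program's environment at start-up. The sketch below assumes a hypothetical environment variable named APP_ENCRYPTION_KEY holding a 32-byte key in hexadecimal; both the name and the format are invented for this example.

```python
import os


def load_key(env_var: str = "APP_ENCRYPTION_KEY") -> bytes:
    """Read a hex-encoded 32-byte key from the environment, or fail loudly."""
    value = os.environ.get(env_var)
    if value is None:
        raise RuntimeError(f"{env_var} is not set; refusing to start")
    key = bytes.fromhex(value)   # raises ValueError on malformed hex
    if len(key) != 32:
        raise ValueError(f"{env_var} must decode to exactly 32 bytes")
    return key
```

The key then lives in deployment configuration rather than in source control or in the distributed binary. A dedicated secrets manager is another option.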

How to use cryptography in your application?

Making choices about what algorithm to use is hard.

If possible, use existing infrastructure.

Example: If sending data over the network, try to use TLS.

Otherwise, use an "opinionated" library that makes all the choices for you.

Example: NaCl or Libsodium.
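
For the TLS case, a sketch using only Python's standard ssl module is shown below. The helper names are illustrative. The point is that the library's default context already makes the important choices (certificate verification and hostname checking), so the application code makes very few decisions of its own.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Client context with the library's secure defaults."""
    ctx = ssl.create_default_context()            # verifies certs and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return ctx


def negotiated_version(host: str, port: int = 443) -> str:
    """Connect to host over TLS and report the protocol version agreed."""
    ctx = make_tls_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling negotiated_version with a hostname needs network access; it is included only to show where the context is used.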

Ethical issues: encryption

Encryption provides confidentiality for both "good" and "bad" people.

Is it ethical...

  • to distribute encryption software?
  • to restrict distribution by law?
  • to imprison citizens who won't give keys to police?
  • for a government to develop and distribute encryption algorithms with secret weaknesses, so they can monitor terrorists?
  • for a journalist to leak information about the above?
  • for a phone manufacturer to produce phones that encrypt users' data, so that the manufacturer cannot read it even when asked to by the user or the police?
  • for a developer to release software that uses encryption without external audit?

Cyber security (UK NCSC)

Cyber security refers to the protection of:

  • information systems
    • hardware
    • software
    • associated infrastructure
  • data on them
  • services they provide

from unauthorised:

  • access
  • harm
  • misuse

intentionally or accidentally.

Information security (ISO 27000)

Information security is preservation of:

  • confidentiality — information is not made available or disclosed to unauthorised individuals, entities, or processes
  • integrity — information is accurate and complete
  • availability — information is accessible and usable on demand by an authorised entity

These are sometimes called the CIA triad.

CIA triad

Confidentiality: Encrypt data. Use passwords and 2FA to restrict access.

Integrity: Keep backups of data with checksums. Use digital signatures.

Availability: Store data on a networked computer (or several computers).
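
The checksum idea under Integrity can be sketched with the standard hashlib module. SHA-256 is an assumed choice here and the function names are illustrative.

```python
import hashlib
import hmac


def checksum(data: bytes) -> str:
    """SHA-256 digest of the data, as a hex string to store with the backup."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected: str) -> bool:
    """True if the data still matches the checksum recorded earlier."""
    return hmac.compare_digest(checksum(data), expected)


if __name__ == "__main__":
    backup = b"customer records, 2024-01-01"
    recorded = checksum(backup)
    print(is_intact(backup, recorded))            # unchanged copy
    print(is_intact(backup + b"!", recorded))     # corrupted copy
```

A bare checksum catches accidental corruption. An attacker who can alter the backup can usually alter the stored checksum too, which is where digital signatures come in.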

No confidentiality:
Store data on CD and post a copy to every household in the country.

No integrity:
Store a backup of your data as an encrypted multi-part RAR and post on Usenet, but turn off your computer before the upload is finished.

No availability:
Store data on a computer with RAID storage in a locked room with no network access.

Data security

So what is data security?

Information is just data plus an interpretation, so data security can just mean information security.

We will mostly take this broad view but...

Data stored on a computer system can be:

  • at rest — not being used, so usually stored on a disk;
  • in use — being used for a computation, so usually stored in memory;
  • in transit — being sent over a network.

Sometimes we may use data security to refer specifically to data not in transit.

Systems, policies and controls

A system could include hardware, software (operating system or applications), administrators or users.

A principal is any entity that participates in a system.
Examples: Microprocessor, webserver, Bob from IT.

A security policy is a statement of what it means for a system and its information to be secure.

A security control is a mechanism for achieving or enforcing a policy.
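
One way to picture the difference: the policy is a statement of who may do what, and the control is the code that checks each request against it. The principals, resource and actions below are invented for illustration.

```python
# Policy: which actions each principal may perform on each resource.
POLICY = {
    ("webserver", "payroll.csv"): {"read"},
    ("bob_from_it", "payroll.csv"): {"read", "write"},
}


def is_allowed(principal: str, resource: str, action: str, policy=POLICY) -> bool:
    """Control: permit a request only if the policy explicitly allows it."""
    # Anything not listed in the policy is denied by default.
    return action in policy.get((principal, resource), set())


if __name__ == "__main__":
    print(is_allowed("bob_from_it", "payroll.csv", "write"))
    print(is_allowed("webserver", "payroll.csv", "write"))
    print(is_allowed("mallory", "payroll.csv", "read"))
```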

Attackers, vulnerabilities and exploits

An attacker or adversary seeks to violate the policy, leading to a security failure.

A vulnerability is any weakness in a system that can lead to a security failure, especially when exploited by an attacker.

An attack vector is a method of attacking a system.

An attack surface is the combination of all attack vectors.

Saltzer and Schroeder Principles (1975)

Principles to follow to encourage secure design:

  • Economy of mechanism.
  • Fail-safe defaults.
  • Complete mediation.
  • Open design.
  • Separation of privilege.
  • Least privilege.
  • Least common mechanism.
  • Psychological acceptability.

OWASP Top 10

Most significant (frequency and/or severity) vulnerabilities in Web applications:

  • A01:2021 - Broken Access Control
  • A02:2021 - Cryptographic Failures
  • A03:2021 - Injection
  • A04:2021 - Insecure Design
  • A05:2021 - Security Misconfiguration
  • A06:2021 - Vulnerable and Outdated Components
  • A07:2021 - Identification and Authentication Failures
  • A08:2021 - Software and Data Integrity Failures
  • A09:2021 - Security Logging and Monitoring Failures
  • A10:2021 - Server-Side Request Forgery
  • A11:2021 - Next Steps (further issues outside the Top 10 itself): Code Quality Issues, Denial of Service, Memory Management Errors

Module Outline and OWASP Top 10

How topics we will look at map onto the OWASP Top 10:

  1. Introduction and Encryption
     • A02:2021 - Cryptographic Failures
  2. Port Scanning and Firewalls
     • A05:2021 - Security Misconfiguration
  3. Buffer Overflows and Fuzzing
     • A11:2021 - Memory Management Errors
  4. SQL Injections
     • A03:2021 - Injection
  5. Noninterference and Security Type Systems
     • A04:2021 - Insecure Design
  6. Unix Permissions
     • A01:2021 - Broken Access Control