How To Decode Computer Language: Hex, Binary, Base64, and Bytecode Explained

By John Reed


Understanding how to decode computer language is one of the most practical skills a developer, security researcher, or curious technologist can develop. Computers store and transmit data as sequences of numbers, and those numbers get represented in several different formats depending on the context — raw binary in CPU registers, hexadecimal in memory dumps, Base64 in email attachments, and bytecode in compiled Python or Java programs. Each format has its own rules, and once you know the rules, decoding any of them becomes a straightforward mechanical process.

This guide walks through every major encoding format you are likely to encounter, shows you how to identify each one at a glance, and provides step-by-step instructions and Python code snippets for decoding each type. By the end you will also know which tools — both command-line and browser-based — do the heavy lifting for you.

1. The Layers of Computer Language: From Source Code to Machine Code

"Computer language" is not a single thing. It is a stack of representations, each translated from the one above it. To decode something you first need to know which layer you are looking at.

Source code

Source code is the human-readable form that programmers write. Python, JavaScript, Java, and C are all source code. A Python file on disk is plain UTF-8 text that any text editor can open. This layer requires no decoding — it is already readable.

Bytecode

Languages like Python and Java do not compile directly to machine code. Instead they compile to bytecode — an intermediate representation. Python stores bytecode in .pyc files inside the __pycache__ directory. Java stores bytecode in .class files. Bytecode is not raw machine code and does not run directly on the CPU — a runtime (the Python interpreter or the Java Virtual Machine) executes it. Bytecode can be disassembled back into a human-readable mnemonic form; we cover that in section 7.

Assembly language

Assembly language is a thin, human-readable layer directly above machine code. Each assembly instruction corresponds to exactly one machine-code instruction. For example, on x86-64 you might write:

MOV EAX, 60
MOV EDI, 1
SYSCALL

That sequence places the system call number 60 (exit on x86-64 Linux) into the EAX register and the exit status 1 into EDI, then triggers the system call. This is the conventional way to invoke the Linux exit(1) system call in 64-bit mode. Humans can read and write assembly, but almost nobody does so for large programs. It is primarily used when inspecting compiled binaries.

Machine code

Machine code is binary instructions — sequences of 1s and 0s — that the CPU executes directly. A compiled C or C++ program (built with gcc, clang, or MSVC) is machine code. Nobody writes machine code by hand — compilers and assemblers produce it. When you open a compiled binary in a hex editor, what you see is machine code expressed as hexadecimal digits.

Data encodings

Separate from programming languages, there are encoding schemes used to represent raw data in printable or transmissible form. Hexadecimal, Base64, URL percent-encoding, and ASCII are all data encodings. They describe how bytes are written down, not how instructions are structured. Most of this article is about these encodings because they are what you encounter in day-to-day work — in HTTP headers, email MIME parts, log files, and data URIs.

2. How to Identify What Type of Encoding You're Looking At

Before you can decode something, you need to know what it is. The good news is that most encoding formats have distinctive fingerprints visible to the naked eye.

When in doubt, paste the string into CyberChef and use the "Magic" operation. It tries many common encodings against the input and ranks the most likely interpretations.
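For a quick local check, the visual clues summarised in the quick-reference table in section 9 can be turned into a rough classifier. The sketch below is illustrative only: the function name guess_encoding and the order of the checks are assumptions, and short strings are often ambiguous (for example, "deadbeef" is valid hex and also valid Base64).

```python
import re


def guess_encoding(s: str) -> str:
    """Return a best-guess label for a short encoded string (heuristic only)."""
    text = s.strip()
    compact = re.sub(r"\s+", "", text)  # ignore spaces between groups
    if not compact:
        return "unknown"
    # Only 0 and 1, in whole 8-bit groups
    if re.fullmatch(r"[01]+", compact) and len(compact) % 8 == 0:
        return "binary"
    # Any %XX escape suggests URL percent-encoding
    if re.search(r"%[0-9A-Fa-f]{2}", text):
        return "url-percent-encoding"
    # Hex digits in whole byte pairs
    if re.fullmatch(r"[0-9A-Fa-f]+", compact) and len(compact) % 2 == 0:
        return "hex"
    # Base64 alphabet, optional = padding, length a multiple of 4
    if re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", compact) and len(compact) % 4 == 0:
        return "base64"
    return "unknown"


for sample in ("01001000 01100101", "48 65 6C 6C 6F", "SGVsbG8=", "Hello%20World"):
    print(sample, "->", guess_encoding(sample))
```

The checks run from the most restrictive alphabet to the least, so a string of only 0s and 1s is reported as binary even though it would also pass the hex test.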

3. How to Decode Hexadecimal (Hex) to Text

Hexadecimal (base-16) is the most common way to display binary files because it is compact and human-scannable. Each byte — which holds a value from 0 to 255 — is represented as exactly two hex characters. Hex dumps appear in debuggers, network packet captures, and file-format documentation.

Step-by-step: decode hex to ASCII text

Take the hex string 48 65 6C 6C 6F. Here is how to decode it manually:

  1. Split into byte pairs: 48, 65, 6C, 6C, 6F
  2. Convert each pair from hex to decimal: 72, 101, 108, 108, 111
  3. Look up each decimal value in the ASCII table: H, e, l, l, o
  4. Concatenate: Hello

| Hex | Decimal | ASCII character |
| --- | --- | --- |
| 48 | 72 | H |
| 65 | 101 | e |
| 6C | 108 | l |
| 6C | 108 | l |
| 6F | 111 | o |

Decode hex in Python

# Decode a hex string to UTF-8 text
result = bytes.fromhex('48656c6c6f').decode('utf-8')
print(result)  # Hello

The bytes.fromhex() class method accepts a string of hex digits, ignoring ASCII whitespace between byte pairs, and returns a bytes object. Calling .decode('utf-8') on it converts the bytes to a Python string.
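To connect the one-liner with the manual steps, here is a minimal sketch that performs the same decoding one byte pair at a time. The helper name hex_to_text is hypothetical, and errors="replace" is an assumed choice so that non-UTF-8 bytes show up as replacement characters instead of raising an exception.

```python
def hex_to_text(hex_string: str) -> str:
    """Decode spaced or contiguous hex pairs to text, one pair at a time."""
    compact = "".join(hex_string.split())          # drop whitespace
    pairs = [compact[i:i + 2] for i in range(0, len(compact), 2)]
    codes = [int(pair, 16) for pair in pairs]      # hex -> decimal
    return bytes(codes).decode("utf-8", errors="replace")


print(hex_to_text("48 65 6C 6C 6F"))                     # Hello
print(bytes.fromhex("48 65 6C 6C 6F").decode("utf-8"))   # Hello
```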

For interactive use, the ASCII decoder tool on this site handles hex-to-ASCII conversion without any coding required.

4. Decoding Binary to Text

Binary is the native language of every computer. Every piece of data — text, images, executables — is ultimately stored as a stream of bits. When binary is written out for humans it is usually grouped into 8-bit bytes, each byte representing one value from 0 to 255.

How binary maps to ASCII

ASCII assigns a number to each printable character. Binary encoding simply writes that number in base 2. The letter H has ASCII value 72. In binary, 72 is 01001000.

| Binary | Decimal | ASCII character |
| --- | --- | --- |
| 01001000 | 72 | H |
| 01100101 | 101 | e |
| 01101100 | 108 | l |
| 01101100 | 108 | l |
| 01101111 | 111 | o |

Manual decoding steps

  1. Split the binary string into 8-bit groups.
  2. Convert each group to decimal. For 01001000: 0×128 + 1×64 + 0×32 + 0×16 + 1×8 + 0×4 + 0×2 + 0×1 = 72.
  3. Map the decimal value to its ASCII character.
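
The three steps above can be sketched directly in Python. This version spells out the place-value sum from step 2 instead of calling int(group, 2), and it also handles input with no spaces between bytes. The function name bits_to_text is an assumption made for illustration.

```python
def bits_to_text(bits: str) -> str:
    """Decode a string of 0s and 1s (spaces optional) to ASCII text."""
    compact = "".join(bits.split())
    if len(compact) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    chars = []
    for start in range(0, len(compact), 8):
        group = compact[start:start + 8]           # step 1: 8-bit group
        # step 2: weights 128, 64, 32, 16, 8, 4, 2, 1 from left to right
        value = sum(int(bit) * 2 ** (7 - pos) for pos, bit in enumerate(group))
        chars.append(chr(value))                   # step 3: ASCII lookup
    return "".join(chars)


print(bits_to_text("01001000 01100101 01101100 01101100 01101111"))  # Hello
```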

Decode binary in Python

binary_str = '01001000 01100101 01101100 01101100 01101111'
groups = binary_str.split()
chars = [chr(int(b, 2)) for b in groups]
print(''.join(chars))  # Hello

Note that binary encoding of text is different from a compiled binary executable. A compiled binary contains machine code instructions, not ASCII characters — decoding it requires a disassembler rather than a simple binary-to-ASCII lookup.

The ASCII decoder tool accepts binary input and converts it to readable text in one click.

5. Decoding Base64

Base64 is an encoding scheme that represents arbitrary binary data using 64 printable ASCII characters: A–Z, a–z, 0–9, plus + and /. It was designed to safely transmit binary data through systems that only handle text, such as email. Base64 is not encryption — anyone can decode it instantly.

Recognising Base64

Example: the word "Hello" in Base64 is SGVsbG8=. The trailing = is padding added because the input (5 bytes) is not evenly divisible by 3.
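
A short sketch makes the padding rule concrete: Base64 turns every 3 input bytes into 4 output characters, so an input of 3 bytes needs no padding, 4 bytes needs two = characters, and 5 bytes needs one.

```python
import base64

for raw in (b"Hel", b"Hell", b"Hello"):
    encoded = base64.b64encode(raw).decode("ascii")
    print(len(raw), encoded, encoded.count("="))
# 3 SGVs 0
# 4 SGVsbA== 2
# 5 SGVsbG8= 1
```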

Decode Base64 from the command line

# Encode
echo "Hello" | base64
# Output: SGVsbG8K

# Decode
echo SGVsbG8K | base64 -d
# Output: Hello

Note that echo appends a newline character to the string, so the encoded output includes that newline (which is why you see SGVsbG8K rather than SGVsbG8=). To encode without the newline, use printf 'Hello' | base64 instead.
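
The same effect is easy to reproduce in Python, where the trailing newline has to be written explicitly. This is a small illustrative check, not part of the command-line workflow:

```python
import base64

with_newline = base64.b64encode(b"Hello\n").decode("ascii")
without_newline = base64.b64encode(b"Hello").decode("ascii")

print(with_newline)     # SGVsbG8K
print(without_newline)  # SGVsbG8=
```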

Decode Base64 in Python

import base64

# Decode Base64 to string
decoded = base64.b64decode('SGVsbG8=').decode()
print(decoded)  # Hello

# Encode a string to Base64
encoded = base64.b64encode(b'Hello').decode()
print(encoded)  # SGVsbG8=

URL-safe Base64

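Some systems use a URL-safe variant of Base64 that replaces + with - and / with _. JSON Web Tokens (JWTs) use this variant and also strip the trailing = padding. In Python, use base64.urlsafe_b64decode() to handle this alphabet, restoring any missing padding first because the function expects the input length to be a multiple of 4.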
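
A minimal sketch of URL-safe decoding with padding restored. The helper name decode_urlsafe is hypothetical, and the second input is a commonly seen sample JWT header segment used purely for illustration.

```python
import base64


def decode_urlsafe(segment: str) -> bytes:
    """Decode URL-safe Base64, restoring any stripped '=' padding first."""
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)


print(decode_urlsafe("-_8"))  # b'\xfb\xff'
print(decode_urlsafe("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9").decode("utf-8"))
# {"alg":"HS256","typ":"JWT"}
```

The expression -len(segment) % 4 gives the number of = characters needed to reach the next multiple of 4, and is 0 when the input is already aligned.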

Use the Base64 encoder/decoder tool on this site to decode Base64 strings instantly in your browser.

6. Decoding URL Percent-Encoding

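URLs can only contain a limited set of safe characters. When a URL needs to include characters outside that set, such as spaces, equals signs, or slashes, those characters are percent-encoded: each byte is replaced by a % sign followed by its two-digit hex value. For ASCII characters that is the ASCII code; non-ASCII characters are first encoded as UTF-8, and every resulting byte gets its own %XX escape.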

| Encoded | Decoded character | ASCII decimal |
| --- | --- | --- |
| %20 | space | 32 |
| %3D | = (equals sign) | 61 |
| %2F | / (forward slash) | 47 |
| %3A | : (colon) | 58 |
| %40 | @ (at sign) | 64 |
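
The table rows can be reproduced by encoding each character with urllib.parse.quote. In this sketch safe="" is passed so that / is escaped too (by default quote leaves it alone), and the non-ASCII character é is included as an assumed extra example to show one escape per UTF-8 byte.

```python
from urllib.parse import quote

for ch in (" ", "=", "/", ":", "@", "é"):
    hex_bytes = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    print(repr(ch), hex_bytes, quote(ch, safe=""))
# ' ' 20 %20
# '=' 3D %3D
# '/' 2F %2F
# ':' 3A %3A
# '@' 40 %40
# 'é' C3 A9 %C3%A9
```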

Decode a URL in Python

from urllib.parse import unquote

encoded = 'Hello%20World%21%20It%27s%20a%20test%3D1'
decoded = unquote(encoded)
print(decoded)  # Hello World! It's a test=1

Decoding form data

HTML form submissions sent with the application/x-www-form-urlencoded content type use percent-encoding with one additional rule: spaces may be encoded as + rather than %20. The Python function urllib.parse.unquote_plus() handles this variant.
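
A brief comparison shows the difference between the two functions, followed by urllib.parse.parse_qs, which applies the same + rule while splitting a whole form body into fields. The sample strings are made up for illustration.

```python
from urllib.parse import parse_qs, unquote, unquote_plus

form_value = "Hello+World%21"

print(unquote(form_value))       # Hello+World!  (the + is left untouched)
print(unquote_plus(form_value))  # Hello World!  (the + becomes a space)

fields = parse_qs("q=hello+world&lang=en")
print(fields)  # {'q': ['hello world'], 'lang': ['en']}
```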

Use the URL percent-encoding decoder tool on this site to decode percent-encoded URLs without writing any code.

7. Decoding Python and Java Bytecode

Bytecode is the compiled, platform-independent form of programs written in Python or Java. It is not human-readable straight out of the file, but both ecosystems provide official tools to disassemble it into a readable mnemonic form.

Python bytecode with the dis module

Python compiles source files to bytecode and caches it in .pyc files inside the __pycache__ directory. The standard library's dis module disassembles Python functions or code objects into human-readable bytecode instructions:

import dis

def greet(name):
    return "Hello, " + name

dis.dis(greet)

# Output (CPython 3.12, abbreviated):
#   3           0 RESUME                   0
#   4           2 LOAD_CONST               1 ('Hello, ')
#               4 LOAD_FAST                0 (name)
#               6 BINARY_OP                0 (+)
#              10 RETURN_VALUE

Each line shows the source line number, the byte offset within the code object, the opcode name, and any arguments. This is the Python Virtual Machine's instruction set — not x86 machine code.
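
When you want to process the instructions in code instead of reading printed output, dis.get_instructions() yields one Instruction object per opcode. The sketch below also prints the raw bytes that dis is decoding. Exact opcode names and offsets vary between CPython versions, so no expected output is shown.

```python
import dis


def greet(name):
    return "Hello, " + name


# Each Instruction carries the byte offset, opcode name, and a readable argument
opnames = []
for instr in dis.get_instructions(greet):
    opnames.append(instr.opname)
    print(instr.offset, instr.opname, instr.argrepr)

# The raw bytecode bytes behind the listing, shown as hex
raw_bytecode = greet.__code__.co_code
print(raw_bytecode.hex(" "))
```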

Java bytecode with javap

The JDK ships with javap, a class-file disassembler. Run it against any .class file to see the JVM bytecode:

javap -c ClassName.class

# Example output fragment:
#   public static void main(java.lang.String[]);
#     Code:
#        0: getstatic     #7  // Field java/lang/System.out
#        3: ldc           #13 // String Hello, World!
#        5: invokevirtual #15 // Method java/io/PrintStream.println
#        8: return

The -verbose flag adds constant pool information, method signatures, and class metadata. For decompilation back to Java source code (rather than just bytecode mnemonics), tools like CFR or Fernflower produce readable Java from .class files.

8. Tools for Decoding Compiled Binaries

Compiled C and C++ programs are machine code — raw binary instructions for the CPU. Decoding them requires a disassembler (which converts machine code to assembly language) or a decompiler (which attempts to reconstruct higher-level pseudocode). Here are the tools professionals use.

GNU objdump

objdump is available on every Linux system with the GNU Binutils package installed. The -d flag disassembles executable sections:

objdump -d ./mybinary

# Example output fragment:
# 0000000000001149 <main>:
#     1149:       55                      push   %rbp
#     114a:       48 89 e5                mov    %rsp,%rbp
#     114d:       bf 01 00 00 00          mov    $0x1,%edi
#     1152:       e8 f9 fe ff ff          call   1050 <exit@plt>

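The left column is the instruction's address, the middle column is the raw machine code bytes in hex, and the right column is the disassembled instruction. objdump prints x86 assembly in AT&T syntax by default, so the source operand comes before the destination.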
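
To see how the middle column relates to the right one, the sketch below splits one line of the example output into its three columns and decodes the call instruction by hand. On x86-64, opcode e8 is followed by a signed 32-bit little-endian offset measured from the end of the instruction. The regular expression and the name parse_objdump_line are assumptions written for this example, not part of objdump.

```python
import re

# address, then hex byte pairs, then the disassembled text
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2} )+)\s*(.*)$")


def parse_objdump_line(line: str):
    """Split one objdump -d line into (address, raw bytes, instruction text)."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    address = int(match.group(1), 16)
    raw = bytes.fromhex(match.group(2).strip())
    return address, raw, match.group(3).strip()


address, raw, text = parse_objdump_line(
    "    1152:       e8 f9 fe ff ff          call   1050 <exit@plt>"
)
rel32 = int.from_bytes(raw[1:5], "little", signed=True)  # bytes after opcode e8
target = address + len(raw) + rel32                       # relative to next instruction
print(text)          # call   1050 <exit@plt>
print(hex(target))   # 0x1050
```

The computed target matches the address objdump prints after the call mnemonic, which confirms that the hex bytes and the assembly text describe the same instruction.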

Ghidra

Ghidra is a free, open-source reverse-engineering platform released by the NSA in 2019. It supports x86, ARM, MIPS, and many other architectures. Unlike objdump, which produces assembly, Ghidra includes a decompiler that reconstructs approximate C-like pseudocode from the binary. This makes it far easier to understand what a program does without reading thousands of assembly instructions.

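Ghidra is available at ghidra-sre.org. It requires a JDK: Java 17 for older releases and JDK 21 for Ghidra 11.2 and later, so check the release notes for the version you download.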

CyberChef

For data encodings (hex, Base64, binary, URL encoding, and many more), CyberChef by GCHQ is the go-to browser-based tool. It supports chaining multiple decode operations in sequence — useful when data has been encoded multiple times (for example, a Base64 string that itself contains a URL-encoded value).
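
The same kind of chain can be scripted. This sketch builds a Base64 string that wraps a URL-encoded value and then peels off the layers in reverse order; the sample value is invented for illustration.

```python
import base64
from urllib.parse import quote, unquote

# Build a layered sample: URL-encode first, then Base64-encode the result
inner = quote("user=john@example.com", safe="")
wrapped = base64.b64encode(inner.encode("ascii")).decode("ascii")

# Decode in reverse order: Base64 first, then percent-encoding
step1 = base64.b64decode(wrapped).decode("ascii")
step2 = unquote(step1)

print(step1)  # user%3Djohn%40example.com
print(step2)  # user=john@example.com
```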

Hex editors

A hex editor lets you view any file as a raw hex dump and edit individual bytes. Popular choices include xxd (command line, ships with Vim), hexdump -C (Linux), HxD (Windows), and ImHex (cross-platform, open-source).
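
If none of these tools is at hand, a basic hex dump takes only a few lines of Python. This is a rough sketch loosely modelled on the offset, hex, and text columns that such tools show; the function name hexdump and the exact column layout are assumptions.

```python
def hexdump(data: bytes, width: int = 16) -> list[str]:
    """Return hex-dump lines: offset, hex bytes, then printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as most hex viewers do
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  |{text_part}|")
    return lines


for line in hexdump(b"Hello, World!\n"):
    print(line)
```

To dump a real file, read it in binary mode first, for example with pathlib.Path(path).read_bytes(), and pass the result to the function.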

9. Quick Reference: Common Encoding Patterns at a Glance

Use this table as a quick cheat sheet when you encounter an unfamiliar string and need to identify its encoding type before decoding it.

| Encoding | Visual clues | Example | Decode with |
| --- | --- | --- | --- |
| Binary | Only 0 and 1, groups of 8 | 01001000 01100101 | ASCII decoder |
| Hexadecimal | 0–9 and A–F in pairs | 48 65 6C 6C 6F | ASCII decoder |
| Base64 | A–Z, a–z, 0–9, +, /; often ends with = or == | SGVsbG8= | Base64 tool |
| URL encoding | %XX patterns | Hello%20World | URL decoder |
| ASCII decimal | Numbers 0–127 separated by spaces | 72 101 108 108 111 | ASCII decoder |
| HTML entities | &amp;, &lt;, &#65; patterns | &#72;&#101;&#108; | HTML decoder |
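
The last two rows of the table are not covered elsewhere in this guide, so here is a minimal sketch for both using only the Python standard library. The inputs are the examples from the table plus one assumed named-entity sample.

```python
import html

# HTML entities: numeric (&#72;) and named (&lt;, &amp;) forms
print(html.unescape("&#72;&#101;&#108;"))      # Hel
print(html.unescape("&lt;b&gt; &amp; &#65;"))  # <b> & A

# ASCII decimal: each space-separated number is one character code
decimal = "72 101 108 108 111"
decoded = "".join(chr(int(n)) for n in decimal.split())
print(decoded)  # Hello
```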

ASCII key values to memorise

ASCII was published in 1963 and defines 128 characters (0–127). Three anchor points make the rest easy to derive:

UTF-8 is the modern successor to ASCII. It is backward-compatible: any byte value below 128 in a UTF-8 file has exactly the same meaning as in ASCII. Characters above 127 are encoded as multi-byte sequences (2–4 bytes each).
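
The backward compatibility and the multi-byte behaviour are both easy to confirm. In this sketch the four sample characters are chosen to need 1, 2, 3, and 4 bytes respectively.

```python
for ch in ("A", "é", "€", "😀"):
    encoded = ch.encode("utf-8")
    print(ch, len(encoded), encoded.hex(" "))
# A 1 41
# é 2 c3 a9
# € 3 e2 82 ac
# 😀 4 f0 9f 98 80

# Pure ASCII bytes decode identically under both codecs
same = b"Hello".decode("ascii") == b"Hello".decode("utf-8")
print(same)  # True
```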

10. FAQ

What is the easiest way to decode computer language for a beginner?

Start by identifying the encoding type using the visual clues in the quick-reference table above. Once you know whether you are looking at binary, hex, Base64, or URL encoding, use the corresponding tool on this site or paste the string into CyberChef. For programming-language bytecode, use dis.dis() in Python or javap -c in Java. You do not need to understand machine code to work with most encoding tasks.

What is the difference between decoding and decryption?

Decoding reverses a publicly known encoding scheme, so no secret key is needed. Hex, Base64, binary, and URL percent-encoding are all encodings that anyone can reverse. Decryption, by contrast, requires a secret key. AES-256 and RSA produce ciphertext that cannot feasibly be reversed without the correct key, regardless of which tool you use. Encoding schemes like Base64 are sometimes confused with encryption because the output looks scrambled, but they provide no security whatsoever.

Can I decode compiled machine code back into readable source code?

Not perfectly. You can disassemble a compiled binary into assembly language using objdump -d, and you can decompile it to approximate C-like pseudocode using Ghidra. However, compilers discard variable names, comments, and most high-level structure during compilation. The recovered code will work similarly to the original but will not be identical to it.

What does "48 65 6C 6C 6F" mean?

It is the word "Hello" written as a hexadecimal dump. 48 hex equals 72 decimal, which is the ASCII code for H. 65 hex is 101 decimal (e), 6C is 108 decimal (l), and 6F is 111 decimal (o). Converting every pair through the ASCII table spells Hello.
