Secure Code Review

Semgrep

Commands: (Only works in Linux or similar distributions)

Installation:
- Create a virtual environment: python3 -m venv my_env
- Activate the created virtual environment:
  - Windows (Git Bash): source my_env/Scripts/activate
  - Linux: source my_env/bin/activate
  - Once activated, use pip to install the semgrep package: pip install semgrep
    Semgrep can also be installed without a virtual environment, but a virtual environment is recommended: if something goes wrong, you can deactivate it and return to your normal environment.
Usage:
- Semgrep Login:
  - Command: semgrep login
    If your API token is not already saved, a prompt is displayed; click it and authorize yourself.
    Default API token location: ~/.semgrep/settings.yml (for example, /home/kali/.semgrep/settings.yml).
- Run semgrep rulesets on source code repository using semgrep scan:
  - Local rulesets:
    Download the rules registry to your local machine for offline use: Semgrep Rulesets Github Repo
    Command: semgrep --config /path/to/rulesets/ source_code/ --json-output=output_file.json
  - Online rulesets:
    Explore Semgrep online rules registry: Explore Semgrep Online Rulesets
    For example, to run the OWASP Top 10 ruleset on your source code repository:
    Command: semgrep --config "p/owasp-top-ten" source_code/ --json-output=output_file.json
    Run scan with default rulesets: semgrep --config auto source_code/ --json-output=output_file.json
- Run semgrep rulesets on source code repository using semgrep ci:
  - Command: cd source_code_repo/ && semgrep ci
    Once the command completes, check the findings on the dashboard. Semgrep does not upload the entire source code, only the findings and the source code snippets associated with them.
    This command is very resource intensive and may use more than 90% of the CPU.
- Semgrep Logout:
  - Command: semgrep logout
    Note: if you have authenticated semgrep on a client's machine, make sure to log out and delete the API token saved at /home/kali/.semgrep/settings.yml.
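
The JSON file written by --json-output can be triaged with a short script. The sketch below is a minimal example, assuming the report has a top-level "results" list whose entries carry "check_id", "path", "start.line" and "extra.severity"; the sample data and the rule id in it are hypothetical, so check the field names against your own report before relying on it.

```python
import json
from collections import Counter


def load_report(path):
    """Load a Semgrep JSON report from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_semgrep(report):
    """Return (findings per severity, list of (rule, file, line) tuples)."""
    results = report.get("results", [])
    by_severity = Counter(
        item.get("extra", {}).get("severity", "UNKNOWN") for item in results
    )
    locations = [
        (item.get("check_id"), item.get("path"), item.get("start", {}).get("line"))
        for item in results
    ]
    return by_severity, locations


# Hypothetical sample shaped like a Semgrep report (not real scan output)
sample = json.loads("""
{
  "results": [
    {
      "check_id": "python.lang.security.audit.formatted-sql-query",
      "path": "app/db.py",
      "start": {"line": 12},
      "extra": {"severity": "ERROR", "message": "SQL built with string formatting"}
    },
    {
      "check_id": "python.lang.security.audit.insecure-hash",
      "path": "app/auth.py",
      "start": {"line": 40},
      "extra": {"severity": "WARNING", "message": "Weak hash function"}
    }
  ]
}
""")

by_severity, locations = summarize_semgrep(sample)
print(dict(by_severity))
for rule, path, line in locations:
    print(f"{path}:{line}  {rule}")
```

To run it on a real scan, replace the sample with load_report("output_file.json").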

Snyk

Commands: (Can work both in Windows and Linux)

Installation:
- Install the binary from official snyk website: https://docs.snyk.io/snyk-cli/install-or-update-the-snyk-cli
  If you don’t want to download the snyk binary from the above source, you can install it from the command line using npm.
  Command: npm install -g snyk
Usage:
- Authenticate with snyk: snyk auth <user_id>
  How to get your snyk auth id?
- Run local scans:
  - Go to the snyk dashboard and enable the “snyk code” option
  - Command to scan code locally: snyk code test --json /path/to/source_code/ > snyk_result.json
    Once this is done, you will get a JSON file containing the results.
    Refer to the "Convert JSON report into HTML" steps below to get the most out of the JSON report.
Convert JSON report into HTML
- Download the tool called snyk-to-html, either from Github or using npm.
  Github: snyk-to-html
  Command Line: npm install snyk-to-html -g
- Steps to run snyk-to-html:
  - Move the snyk JSON result to /path/to/source_code/
  - cd /path/to/source_code/
  - Verify that the snyk JSON file is present under /path/to/source_code/
  - Command: snyk-to-html -i snyk_result.json -o snyk-report.html

Note: Make sure to regenerate your Snyk API token once you have completed your activity on the client’s machine. This ensures that your token is not left exposed and maintains the security of your account.

Bandit + Safety

Commands: (Works on Windows, Linux, and similar distributions)

Installation:
- Create a virtual environment: python -m venv myenv
- Activate the created virtual environment:
  - Windows: myenv\Scripts\activate
  - Linux: source myenv/bin/activate
  - Once activated, use pip to download the Bandit library: pip install bandit
    Bandit can also be installed directly without a virtual environment, but using a virtual environment is recommended to avoid conflicts with other packages.
Usage:
- Run Bandit on source code:
  - To scan a Python file or directory: bandit -r /path/to/source_code/
  - Save the output of a bandit scan: bandit -r /path/to/source_code/ -f json -o output-file.json (the -f flag specifies the output format, here JSON; the -o flag names the output file)
  - To Exclude files or directories: bandit -r /path/to/source_code/ -x dir1,dir2,filename1
- Find Outdated and Vulnerable Packages:
  - Command: pip install safety
  - Scan requirements.txt file:
    - safety check -r requirements.txt
    - cat requirements.txt | safety check --stdin
  - Save output in a file: safety check -r requirements.txt --output json > safety_report.json
    Note: pin each package to a specific version in requirements.txt; otherwise safety generates a warning and ignores the unpinned packages.
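
The Bandit JSON report can be filtered the same way. The sketch below is a minimal example, assuming the report contains a "results" list whose entries have "filename", "line_number", "test_id", "issue_severity" and "issue_text"; the sample content is hypothetical and only illustrates the shape.

```python
import json

SEVERITY_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


def issues_at_or_above(report, minimum="HIGH"):
    """Return Bandit issues whose severity is at least `minimum`."""
    threshold = SEVERITY_ORDER[minimum]
    return [
        issue
        for issue in report.get("results", [])
        if SEVERITY_ORDER.get(issue.get("issue_severity", "LOW"), 0) >= threshold
    ]


# Hypothetical sample shaped like a Bandit JSON report (not real scan output)
sample = json.loads("""
{
  "results": [
    {"filename": "app/run.py", "line_number": 8, "test_id": "B602",
     "issue_severity": "HIGH", "issue_text": "subprocess call with shell=True"},
    {"filename": "app/util.py", "line_number": 3, "test_id": "B311",
     "issue_severity": "LOW", "issue_text": "Standard pseudo-random generator"}
  ]
}
""")

for issue in issues_at_or_above(sample, "HIGH"):
    print(f"{issue['filename']}:{issue['line_number']} [{issue['test_id']}] {issue['issue_text']}")
```

For a real scan, load output-file.json with json.load and pass the resulting dictionary to issues_at_or_above.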

Manual Static Code Analysis

SQL Injection

Insecure Code

def get_user_info(user_id):
    query = f"SELECT * FROM users WHERE id = {user_id}" # direct use of "user_id" in the SQL statement
    execute_query(query)

What's the issue in the above code snippet?
The user_id is directly inserted into the SQL query, which can lead to SQL injection if user_id is manipulated by an attacker.

Secure Code

def get_user_info(user_id):
    query = "SELECT * FROM users WHERE id = %s" # use a placeholder instead of interpolating the variable into the SQL string
    execute_query(query, (user_id,))

Use parameterized queries to avoid SQL injection.
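
A runnable illustration of the same idea, as a minimal sketch using the standard library's sqlite3 module with an in-memory database. The table contents are made up for the demo, and sqlite3 uses ? as its placeholder where other drivers use %s.

```python
import sqlite3

# In-memory database with demo data (hypothetical rows)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)


def get_user_info(connection, user_id):
    """Look up a user; the driver binds user_id as data, never as SQL."""
    cursor = connection.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    )
    return cursor.fetchall()


print(get_user_info(conn, 1))           # [(1, 'alice')]
print(get_user_info(conn, "1 OR 1=1"))  # [] because the payload is compared as a plain value
```

Because the payload is bound as a value, the injected OR clause is never parsed as SQL and no extra rows come back.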

XSS (Cross Site Scripting)

Insecure Code

<p>User Input: <script>document.write(decodeURIComponent(location.search.substring(1)));</script></p>

What's the issue in the above code snippet?
The code directly writes user input into the HTML document without any sanitization or escaping, allowing an attacker to inject and execute arbitrary JavaScript.

<button onclick="showMessage()">Click me</button>
<script>
    function showMessage() {
        var userMessage = document.getElementById('messageInput').value;
        document.getElementById('messageDisplay').innerHTML = userMessage;
    }
</script>

What's the issue in the above code snippet?
User input is directly inserted into the HTML content without any escaping.

const template = `Hello, ${req.query.name}!`;
res.send(template);

What's the issue in the above code snippet?
This is an example of unsafe template interpolation: untrusted user input (req.query.name) is inserted directly into the response through a template literal, so any markup the attacker supplies reaches the browser unescaped, which can lead to XSS. We have encountered this issue many times during secure code reviews.
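
The common fix for all three snippets is to escape user input before it is placed into HTML. The sketch below shows the idea in Python with the standard library's html.escape; the render_greeting function is a hypothetical stand-in for a server-side handler. In browser code, assigning to textContent instead of innerHTML serves the same purpose.

```python
import html


def render_greeting(name):
    """Build an HTML fragment with the user-supplied name escaped."""
    return f"<p>Hello, {html.escape(name)}!</p>"


print(render_greeting("Alice"))
# <p>Hello, Alice!</p>
print(render_greeting("<script>alert(1)</script>"))
# <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;!</p>
```

The escaped payload is displayed as text by the browser rather than executed as a script.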

Improper Error Handling

Insecure Code

try:
    perform_sensitive_operation()
except Exception as e:  # 'e' is an object that contains meta-information about the exception
    print(f"An error occurred: {e}")  # printing the exception details in the response

What's the issue in the above code snippet?
Printing the exception message can reveal sensitive information about the internal workings of the application.

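Secure version of the above code snippet

import logging

try:
    perform_sensitive_operation()
except Exception:
    # Full details (including the traceback) go to the server-side log only
    logging.error("An error occurred during operation", exc_info=True)
    # The user sees a generic message with no internal details
    print("An unexpected error occurred. Please try again later.")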

Only expose the necessary information, not the entire exception object in the response.

Hardcoded Secrets

Insecure Code

api_key = "123456789abcdef" # hardcoded the api key
password = "dfldkfdflk" # hardcoded the password value

What's the issue in the above code snippet?
Hardcoding sensitive information like API keys in source code is a security risk.

Secure Code

import os

api_key = os.getenv('api_key')
password = os.getenv('password')

Load sensitive credentials from environment variables instead of embedding them in the source code.
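
Note that os.getenv returns None when a variable is not set, which can surface later as a confusing failure. A minimal sketch of a fail-fast helper follows; the name require_env and the DEMO_API_KEY variable are hypothetical and used only for illustration.

```python
import os


def require_env(name):
    """Return the value of an environment variable or fail with a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo only: in practice the variable is set outside the program
os.environ["DEMO_API_KEY"] = "example-value"
api_key = require_env("DEMO_API_KEY")
print("API key loaded:", bool(api_key))

try:
    require_env("DEMO_MISSING_SECRET")
except RuntimeError as error:
    print(error)
```

Failing at startup makes a missing secret obvious, and the error message names the variable without ever printing a secret value.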

Vulnerable Cryptographic Algorithms

During a secure code review, look for outdated and vulnerable cryptographic algorithms in the source code. The following algorithms are considered outdated or vulnerable:

DES (Data Encryption Standard)
3DES (Triple DES)
RC4 (Rivest Cipher 4)
MD5 (Message Digest Algorithm 5)
SHA-1 (Secure Hash Algorithm 1)
RSA with a key that is too short (768 bits or less is trivially weak; keys below 2048 bits are generally no longer recommended)
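
When one of these algorithms turns up in a review, the usual recommendation is a modern replacement. The sketch below uses only the standard library: SHA-256 for general-purpose hashing in place of MD5 or SHA-1, and PBKDF2-HMAC-SHA256 for password storage. The function names and the iteration count are assumptions for illustration; tune the count to your own environment.

```python
import hashlib
import hmac
import os


def sha256_hex(data):
    """General-purpose digest to use where MD5 or SHA-1 was used before."""
    return hashlib.sha256(data).hexdigest()


def hash_password(password, salt=None, iterations=600_000):
    """Derive a password hash with PBKDF2-HMAC-SHA256 and a random salt."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=600_000):
    """Recompute the hash and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


print(sha256_hex(b"abc"))
salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

A plain fast hash such as SHA-256 is not suitable for storing passwords on its own, which is why the sketch separates the two use cases.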

Path Traversal Vulnerability

Insecure Code

import os

def read_file(file_path):  # file_path might contain something malicious
    # Assume `file_path` is obtained from user input
    base_directory = '/var/www/html/uploads/'
    full_path = os.path.join(base_directory, file_path)  # file_path is used here without any validation or input sanitization
    
    with open(full_path, 'r') as file:
        content = file.read()
    
    return content

What's the issue in the above code snippet?
An attacker could input something like ../../etc/passwd in file_path variable to access sensitive files outside the intended directory.

Secure version of the above code

import os

def read_file(file_path):
    # Assume `file_path` is obtained from user input
    base_directory = os.path.realpath('/var/www/html/uploads/')

    # Resolve the requested path against the base directory,
    # collapsing ".." segments and following symlinks
    full_path = os.path.realpath(os.path.join(base_directory, file_path))

    # Ensure the resolved path is still inside the base directory.
    # os.path.commonpath compares whole path components; os.path.commonprefix
    # compares character by character and would accept a sibling directory
    # such as /var/www/html/uploads_evil
    if os.path.commonpath([base_directory, full_path]) != base_directory:
        raise ValueError("Invalid file path.")

    # Ensure the file exists
    if not os.path.isfile(full_path):
        raise FileNotFoundError("File does not exist.")

    with open(full_path, 'r') as file:  # full_path is now guaranteed to be inside base_directory
        content = file.read()

    return content

Several validations are performed before full_path is passed to open().
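
The containment check can also be expressed with pathlib. The following is a minimal sketch, assuming Python 3.9 or later for Path.is_relative_to; the helper name resolve_inside and the temporary directory layout are hypothetical and exist only to make the demo runnable.

```python
import tempfile
from pathlib import Path


def resolve_inside(base_directory, user_path):
    """Resolve user_path under base_directory or raise ValueError if it escapes."""
    base = Path(base_directory).resolve()
    candidate = (base / user_path).resolve()
    # is_relative_to compares whole path components, not characters
    if not candidate.is_relative_to(base):
        raise ValueError("Invalid file path.")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    uploads = Path(tmp) / "uploads"
    uploads.mkdir()
    (uploads / "report.txt").write_text("ok")

    print(resolve_inside(uploads, "report.txt").read_text())

    for attempt in ("../../etc/passwd", "../uploads_evil/x", "/etc/passwd"):
        try:
            resolve_inside(uploads, attempt)
        except ValueError:
            print("rejected:", attempt)
```

All three attempts are rejected, including the sibling directory uploads_evil, which a character-based prefix comparison would let through.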

Uncontrolled Resource Consumption

Insecure Code

import requests

def fetch_data(url):
    try:
        # Insecure: No timeout specified
        response = requests.get(url) # no time out argument is specified here
        response.raise_for_status()  # Raise an exception for HTTP errors
        return response.text
    except requests.RequestException as e:
        return f"An error occurred: {e}"

url = "https://example.com/api/data"
data = fetch_data(url)

What's the issue in the above source code?
If the server is unresponsive or very slow, the request will hang indefinitely, potentially leading to resource exhaustion on the client side.

Secure version of the above code

import requests

def fetch_data(url):
    try:
        # Secure: Specify a timeout (e.g., 10 seconds)
        response = requests.get(url, timeout=10) # use of "timeout" argument here
        response.raise_for_status()  # Raise an exception for HTTP errors
        return response.text
    except requests.RequestException as e:
        return f"An error occurred: {e}"

url = "https://example.com/api/data"
data = fetch_data(url)

Always pass the "timeout" argument when calling the get/post/put/delete functions of the requests library. With a timeout set, the request waits only for the specified amount of time before aborting, which prevents the application from hanging indefinitely.
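
The effect of a timeout can be observed locally without any external service. The sketch below swaps in the standard library's urllib.request for requests so that it runs without third-party packages; the silent local server and the function names are hypothetical test scaffolding, not part of any library.

```python
import socket
import threading
import urllib.request


def fetch_with_timeout(url, timeout_seconds=2.0):
    """Return the response body, or None if the server fails to answer in time."""
    # Empty ProxyHandler: ignore proxy settings so the local demo connects directly
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout_seconds) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError and socket timeouts are both OSError subclasses
        return None


def start_silent_server():
    """Start a local server that accepts connections but never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    held_connections = []

    def accept_forever():
        while True:
            try:
                connection, _ = server.accept()
            except OSError:
                break  # server socket was closed
            held_connections.append(connection)  # keep it open, send nothing

    threading.Thread(target=accept_forever, daemon=True).start()
    return server


server = start_silent_server()
port = server.getsockname()[1]
result = fetch_with_timeout(f"http://127.0.0.1:{port}/", timeout_seconds=0.5)
print("Response:", result)
server.close()
```

Without the timeout argument the call would block for as long as the server keeps the connection open; with it, the function gives up after half a second and returns None.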

Use of Standard Pseudo-Random Number Generator (PRNG)

Insecure Code

import random # This module is not suitable for security-sensitive values

def generate_password(length):
    # Insecure: Using a standard PRNG
    chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
    password = ''.join(random.choice(chars) for _ in range(length))
    return password

# Generate a password
print(generate_password(12))

What's the issue in the above source code?
The random module’s PRNG is not cryptographically secure. If an attacker knows the algorithm and has access to some generated values, they might predict future values or reverse-engineer the seed.
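
Secure version of the above code

A minimal sketch using the standard library's secrets module, which draws from the operating system's cryptographically secure random source. The alphabet matches the one in the insecure example.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_password(length):
    """Build a password from a cryptographically secure random source."""
    return ''.join(secrets.choice(ALPHABET) for _ in range(length))


# Generate a password
print(generate_password(12))
```

Use the secrets module (or os.urandom) for passwords, tokens, and any other value an attacker must not be able to predict.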

Himan10/secure_code_review.md
