Installation (Semgrep):
- Create a virtual environment:
python3 -m venv my_env
- Activate the virtual environment:
- Windows:
my_env\Scripts\activate
- Linux:
source my_env/bin/activate
- Once activated, use pip to install the semgrep package:
pip install semgrep
Semgrep can also be installed without a virtual environment, but using one is recommended: if something goes wrong, you can deactivate it and return to your normal environment.
Usage (Semgrep):
Semgrep Login:
- Command:
semgrep login
If your API token is not saved yet, the command displays a sign-in prompt; follow it and authorize yourself.
Default API token location: /home/kali/.semgrep/settings.yml (that is, ~/.semgrep/settings.yml for the current user).
Run Semgrep rulesets on a source code repository using semgrep scan:
Local rulesets:
Download the offline rules registry to your local machine: Semgrep Rulesets GitHub Repo
Command: semgrep --config /path/to/rulesets/ source_code/ --json-output=output_file.json
Online rulesets:
Explore the Semgrep online rules registry: Explore Semgrep Online Rulesets
For example, to run the OWASP Top 10 ruleset on your source code repository:
Command: semgrep --config "p/owasp-top-ten" source_code/ --json-output=output_file.json
Run a scan with the default rulesets: semgrep --config auto source_code/ --json-output=output_file.json
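Once a scan has written output_file.json, the findings can be triaged with a short script. The following is a minimal sketch, not part of Semgrep itself: it assumes the report has a top-level "results" list whose entries carry "check_id", "path", "start.line" and "extra.severity" (check this against your own report), and the names summarize_semgrep and format_finding are made up for illustration.

```python
import json
from collections import Counter

def summarize_semgrep(report):
    """Count findings per severity in a parsed Semgrep JSON report."""
    counts = Counter()
    for finding in report.get("results", []):
        # Assumed layout: severity lives under the "extra" object
        severity = finding.get("extra", {}).get("severity", "UNKNOWN")
        counts[severity] += 1
    return counts

def format_finding(finding):
    """Render one finding as 'path:line rule_id'."""
    line = finding.get("start", {}).get("line", "?")
    return f"{finding.get('path', '?')}:{line} {finding.get('check_id', '?')}"

# Hand-written sample in the assumed layout. For a real scan, load the file:
#   with open("output_file.json", encoding="utf-8") as fh:
#       report = json.load(fh)
sample_report = json.loads("""
{"results": [
  {"check_id": "rule.a", "path": "app.py", "start": {"line": 10}, "extra": {"severity": "ERROR"}},
  {"check_id": "rule.b", "path": "db.py", "start": {"line": 3}, "extra": {"severity": "WARNING"}},
  {"check_id": "rule.a", "path": "api.py", "start": {"line": 7}, "extra": {"severity": "ERROR"}}
]}
""")

counts = summarize_semgrep(sample_report)
print(dict(counts))
for finding in sample_report["results"]:
    print(format_finding(finding))
```

Sorting or filtering on the severity counts first makes it easier to decide which findings to review manually.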
Run semgrep rulesets on source code repository using semgrep ci:
- Command:
cd source_code_repo/ && semgrep ci
Once the command completes, go to the dashboard and check the findings there. The command does not upload the entire source code, only the findings and the source code snippets associated with those findings.
This command is very resource intensive and may use more than 90% of the CPU.
Semgrep Logout:
- Command:
semgrep logout
Note: if you have authenticated Semgrep on a client's machine, make sure to log out and delete the API token saved at /home/kali/.semgrep/settings.yml.
Installation (Snyk):
- Install the binary from the official Snyk website: https://docs.snyk.io/snyk-cli/install-or-update-the-snyk-cli
If you don't want to download the Snyk binary from the above source, you can install it from the command line using npm.
Command: npm install -g snyk
Usage (Snyk):
Convert JSON report into HTML
Download the tool called snyk-to-html, either from GitHub or using npm.
GitHub: snyk-to-html
Command line: npm install snyk-to-html -g
Steps to run snyk-to-html:
- Move the Snyk JSON result to /path/to/source_code/
- cd /path/to/source_code/
- Verify that the Snyk JSON file is present under /path/to/source_code/
- Command:
snyk-to-html -i snyk-results.json -o snyk-report.html
Note: Make sure to regenerate your Snyk API token once you have completed your activity on the client’s machine. This ensures that your token is not left exposed and maintains the security of your account.
Installation (Bandit):
- Create a virtual environment:
python -m venv myenv
- Activate the virtual environment:
- Windows:
myenv\Scripts\activate
- Linux:
source myenv/bin/activate
- Once activated, use pip to install the Bandit library:
pip install bandit
Bandit can also be installed directly without a virtual environment, but using a virtual environment is recommended to avoid conflicts with other packages.
Usage (Bandit):
Run Bandit on source code:
- To scan a Python file or directory:
bandit -r /path/to/source_code/
- To save the output of a Bandit scan:
bandit -r /path/to/source_code/ -f json -o output-file.json
(The -f flag specifies the output format; in this case the result is generated in JSON format.)
- To exclude files or directories:
bandit -r /path/to/source_code/ -x dir1,dir2,filename1
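A saved Bandit report can be post-processed to focus on the most serious findings. The sketch below is only an illustration: it assumes the JSON report contains a "results" list whose entries have "filename", "line_number", "test_id" and "issue_severity" (LOW, MEDIUM or HIGH) fields, and filter_bandit_findings is a hypothetical helper name.

```python
SEVERITY_ORDER = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}

def filter_bandit_findings(report, min_severity="MEDIUM"):
    """Return findings whose issue_severity is at least min_severity."""
    threshold = SEVERITY_ORDER[min_severity]
    return [
        issue
        for issue in report.get("results", [])
        if SEVERITY_ORDER.get(issue.get("issue_severity", ""), 0) >= threshold
    ]

# Hand-written sample in the assumed layout. For a real scan, load
# output-file.json with json.load() instead.
sample_report = {
    "results": [
        {"filename": "app.py", "line_number": 12, "test_id": "B608", "issue_severity": "MEDIUM"},
        {"filename": "util.py", "line_number": 4, "test_id": "B101", "issue_severity": "LOW"},
        {"filename": "run.py", "line_number": 30, "test_id": "B602", "issue_severity": "HIGH"},
    ]
}

high_only = filter_bandit_findings(sample_report, "HIGH")
for issue in filter_bandit_findings(sample_report):
    print(f"{issue['filename']}:{issue['line_number']} {issue['test_id']} {issue['issue_severity']}")
```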
Find Outdated and Vulnerable Packages (Safety):
- Install Safety:
pip install safety
- Scan requirements.txt file:
safety check -r requirements.txt
cat requirements.txt | safety check --stdin
- Save output in a file:
safety check -r requirements.txt --output json > safety-report.json
Note: packages must be listed with their exact versions in the requirements.txt file; otherwise Safety generates a warning and ignores those packages.
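Because unpinned packages are skipped, it can help to check requirements.txt for them before running the scan. The following is a rough sketch under simplifying assumptions: it treats only "name==version" lines as pinned, skips option lines such as -r or -e, and cuts everything after a "#" (so URL requirements with fragments are not handled). The helper name find_unpinned is hypothetical.

```python
import re

# Matches "name==version", optionally with extras such as name[extra]==1.0
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")

def find_unpinned(requirements_text):
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like -r/-e
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned

sample = """\
requests==2.31.0
flask>=2.0
# a comment
django
numpy==1.26.4  # pinned
"""
print(find_unpinned(sample))  # ['flask>=2.0', 'django']
```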
Insecure Code
def get_user_info(user_id):
    query = f"SELECT * FROM users WHERE id = {user_id}"  # direct use of "user_id" in the SQL statement
    execute_query(query)
What's the issue in the above code snippet?
The user_id is directly inserted into the SQL query, which can lead to SQL injection if user_id is manipulated by an attacker.
Secure Code
def get_user_info(user_id):
    query = "SELECT * FROM users WHERE id = %s"  # placeholder instead of interpolating the variable into the string
    execute_query(query, (user_id,))  # user_id is passed separately and bound as data, not as SQL
Use parameterized queries to avoid SQL injection.
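To see the difference in practice, here is a small self-contained sketch using Python's built-in sqlite3 module with an in-memory database. The table, the data and the function names are invented for the demo, and sqlite3 uses "?" as its placeholder where other drivers use %s.

```python
import sqlite3

def get_user_info_insecure(conn, user_id):
    # Vulnerable on purpose: the value becomes part of the SQL text
    return conn.execute(f"SELECT id, name FROM users WHERE id = {user_id}").fetchall()

def get_user_info(conn, user_id):
    # The value is bound as data and never parsed as SQL
    return conn.execute("SELECT id, name FROM users WHERE id = ?", (user_id,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", [(1, "alice"), (2, "bob")])

payload = "1 OR 1=1"
print(get_user_info_insecure(conn, payload))  # the injected condition matches every row
print(get_user_info(conn, payload))           # the payload is compared as a plain value
print(get_user_info(conn, 1))                 # a normal lookup still works
```

With the parameterized version the payload matches no row, because the whole string is compared against the id column as a single value.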
Insecure Code
<p>User Input: <script>document.write(decodeURIComponent(location.search.substring(1)));</script></p>
What's the issue in the above code snippet?
The code directly writes user input into the HTML document without any sanitization or escaping, allowing an attacker to inject and execute arbitrary JavaScript.
<button onclick="showMessage()">Click me</button>
<script>
function showMessage() {
    var userMessage = document.getElementById('messageInput').value;
    document.getElementById('messageDisplay').innerHTML = userMessage;
}
</script>
What's the issue in the above code snippet?
User input is directly inserted into the HTML content without any escaping.
const template = `Hello, ${req.query.name}!`;
res.send(template);
What's the issue in the above code snippet?
This is an example of template injection: the developer inserts untrusted user input (req.query.name) directly into a template without escaping it, which can lead to XSS. We have encountered this issue many times during secure code reviews.
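The common fix for all three snippets is to escape or encode untrusted input before it reaches HTML. The snippets above are JavaScript; the sketch below shows the same idea in Python with the standard library's html.escape, and render_greeting is a hypothetical helper. In browser-side code, the analogous change is to assign to textContent instead of innerHTML.

```python
import html

def render_greeting(name):
    """Build an HTML fragment with the untrusted value escaped."""
    # html.escape converts &, < and > (and, by default, quotes) to HTML entities
    return f"<p>Hello, {html.escape(name)}!</p>"

print(render_greeting("<script>alert(1)</script>"))
# <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;!</p>
```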
Insecure Code
try:
    perform_sensitive_operation()
except Exception as e:  # 'e' is an object that contains meta-information about the exception
    print(f"An error occurred: {e}")  # printing the entire object in the response
What's the issue in the above code snippet?
Printing the exception message can reveal sensitive information about the internal workings of the application.
Secure version of the above code snippet
import logging
try:
    perform_sensitive_operation()
except Exception:
    # Log the full details (including the traceback) on the server side only
    logging.error("An error occurred during operation", exc_info=True)
    print("An unexpected error occurred. Please try again later.")
Only expose the necessary information in the response, not the entire exception object.
Insecure Code
api_key = "123456789abcdef" # hardcoded the api key
password = "dfldkfdflk" # hardcoded the password value
What's the issue in the above code snippet?
Hardcoding sensitive information like API keys in source code is a security risk.
Secure Code
import os
api_key = os.getenv('api_key')  # read from the environment at runtime
password = os.getenv('password')
Use environment variables to supply sensitive credentials instead of hardcoding them in the source code.
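One caveat is that os.getenv returns None when a variable is not set, which can surface later as a confusing error. A possible pattern, sketched here with a hypothetical helper require_env and a made-up variable name DEMO_API_KEY, is to fail early with a clear message:

```python
import os

def require_env(name):
    """Return the value of an environment variable or fail loudly."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Demo only: normally the shell or the deployment sets this, not the code
os.environ["DEMO_API_KEY"] = "value-set-outside-the-code"
api_key = require_env("DEMO_API_KEY")
```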
During a secure code review, look out for outdated and vulnerable cryptographic algorithms in the source code. Some of the algorithms that are considered outdated and vulnerable are listed below.
DES (Data Encryption Standard)
3DES (Triple DES)
RC4 (Rivest Cipher 4)
MD5 (Message Digest Algorithm 5)
SHA-1 (Secure Hash Algorithm 1)
RSA with too short a key (768 bits or less; keys shorter than 2048 bits are generally no longer recommended)
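A quick first pass for these algorithms can be automated with a keyword search. The sketch below is a deliberately crude illustration, not a replacement for the scanners described earlier: the pattern list and the name find_weak_crypto are assumptions, it does not cover RSA key sizes, and keyword matching will produce false positives that need manual review.

```python
import re

# Hypothetical, deliberately small pattern list; extend it for your codebase
WEAK_CRYPTO_PATTERNS = {
    "MD5": re.compile(r"\bmd5\b", re.IGNORECASE),
    "SHA-1": re.compile(r"\bsha-?1\b", re.IGNORECASE),
    "DES/3DES": re.compile(r"\b(3des|des3|des|tripledes)\b", re.IGNORECASE),
    "RC4": re.compile(r"\b(rc4|arc4)\b", re.IGNORECASE),
}

def find_weak_crypto(source_text):
    """Return (line_number, algorithm, line) tuples for suspicious lines."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        for algorithm, pattern in WEAK_CRYPTO_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, algorithm, line.strip()))
    return hits

sample = '''import hashlib
digest = hashlib.md5(data).hexdigest()
token = hashlib.sha256(data).hexdigest()
cipher = DES.new(key, DES.MODE_ECB)
'''
for hit in find_weak_crypto(sample):
    print(hit)
```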
Insecure Code
import os
def read_file(file_path):  # file_path might contain something malicious
    # Assume `file_path` is obtained from user input
    base_directory = '/var/www/html/uploads/'
    full_path = os.path.join(base_directory, file_path)  # file_path is used here without any validation or input sanitization
    with open(full_path, 'r') as file:
        content = file.read()
    return content
What's the issue in the above code snippet?
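An attacker could supply something like ../../../../etc/passwd in the file_path variable to access sensitive files outside the intended directory. An absolute path such as /etc/passwd works as well, because os.path.join discards the base directory when the second argument is absolute.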
Secure version of the above code
import os
def read_file(file_path):
    # Assume `file_path` is obtained from user input
    base_directory = os.path.realpath('/var/www/html/uploads/')
    # Resolve the requested path (collapses "..", follows symlinks) to prevent directory traversal
    full_path = os.path.realpath(os.path.join(base_directory, file_path))
    # Ensure the resolved path is within the base directory. commonpath compares whole path
    # components; commonprefix compares characters and would accept /var/www/html/uploads_evil
    if os.path.commonpath([base_directory, full_path]) != base_directory:
        raise ValueError("Invalid file path.")
    # Ensure the file exists
    if not os.path.isfile(full_path):
        raise FileNotFoundError("File does not exist.")
    with open(full_path, 'r') as file:  # full_path is now known to be inside base_directory
        content = file.read()
    return content
A few validations and checks are performed before full_path is used in the open() method.
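One subtlety of such containment checks is worth demonstrating: os.path.commonprefix compares strings character by character, so a sibling directory whose name merely starts with the base directory's name can slip through, whereas os.path.commonpath compares whole path components. The sketch below contrasts the two using purely lexical path handling (no files are touched); both helper names are hypothetical.

```python
import os

def is_inside_commonprefix(base, user_path):
    """Character-wise prefix check: can be fooled by sibling directories."""
    base = os.path.normpath(base)
    candidate = os.path.normpath(os.path.join(base, user_path))
    return os.path.commonprefix([base, candidate]) == base

def is_inside_commonpath(base, user_path):
    """Component-wise check: compares whole path segments."""
    base = os.path.normpath(base)
    candidate = os.path.normpath(os.path.join(base, user_path))
    return os.path.commonpath([base, candidate]) == base

base = "/var/www/html/uploads"
print(is_inside_commonprefix(base, "../uploads_evil/x.txt"))  # True: wrongly accepted
print(is_inside_commonpath(base, "../uploads_evil/x.txt"))    # False: rejected
print(is_inside_commonpath(base, "reports/a.txt"))            # True: legitimate path
```

Note that normpath does not resolve symbolic links; when links may exist inside the upload directory, os.path.realpath is the safer choice for building both paths.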
Insecure Code
import requests
def fetch_data(url):
    try:
        # Insecure: no timeout specified, so the call can block indefinitely
        response = requests.get(url)
        response.raise_for_status()  # Raise an exception for HTTP errors
        return response.text
    except requests.RequestException as e:
        return f"An error occurred: {e}"
url = "https://example.com/api/data"
data = fetch_data(url)
What's the issue in the above source code?
If the server is unresponsive or very slow, the request will hang indefinitely, potentially leading to resource exhaustion on the client side.
Secure version of the above code
import requests
def fetch_data(url):
    try:
        # Secure: specify a timeout (e.g., 10 seconds)
        response = requests.get(url, timeout=10)  # use of the "timeout" argument here
        response.raise_for_status()  # Raise an exception for HTTP errors
        return response.text
    except requests.RequestException as e:
        return f"An error occurred: {e}"
url = "https://example.com/api/data"
data = fetch_data(url)
Always pass the "timeout" argument when calling the get/post/put/delete functions of the requests library. With a timeout set, the request waits only for the specified amount of time before aborting, which prevents the application from hanging indefinitely.
Insecure Code
import random  # this module is not suitable for security purposes
def generate_password(length):
    # Insecure: using a standard (non-cryptographic) PRNG
    chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
    password = ''.join(random.choice(chars) for _ in range(length))
    return password
# Generate a password
print(generate_password(12))
What's the issue in the above source code?
The random module’s PRNG is not cryptographically secure. If an attacker knows the algorithm and has access to some generated values, they might predict future values or reverse-engineer the seed.
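A secure counterpart, sketched here as one reasonable option, is to draw the characters from Python's secrets module, which uses the operating system's cryptographically secure random source. The function keeps the same name and alphabet as the insecure snippet for easy comparison.

```python
import secrets
import string

def generate_password(length):
    # Secure: secrets.choice draws from the operating system's CSPRNG
    chars = string.ascii_letters + string.digits
    return ''.join(secrets.choice(chars) for _ in range(length))

# Generate a password
print(generate_password(12))
```

Only the source of randomness changes; password length and alphabet policy remain a separate decision.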