
@hugobowne
hugobowne / Research Summary on Dropout in Neural Networks
Created July 24, 2025 17:31
Comprehensive overview of recent research on dropout in neural networks including Bayesian methods, robustness, and generalization.
# Dropout in Neural Networks
This summary provides an overview of recent research findings and perspectives on dropout as a regularization technique in neural networks based on the latest arXiv papers.
## Summary of Work
Several recent studies explore the role and effects of dropout in neural networks across a range of contexts including Bayesian deep learning, model robustness, and uncertainty estimation. Dropout is commonly used to prevent overfitting by randomly dropping units during training, which helps the model generalize better.
One notable approach is the combination of dropout with Bayesian inference methods, such as Monte Carlo (MC) dropout, to quantify predictive uncertainty and improve calibration of probabilistic forecasts. For instance, an ensemble technique utilizing initial weights combined with MC dropout was shown to enhance both forecast skill and uncertainty calibration in convective initiation nowcasting.
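The MC dropout idea above can be sketched in a few lines of NumPy: dropout is kept active at prediction time, and the spread across stochastic forward passes serves as an uncertainty estimate. The network weights, dropout rate, and sample count below are illustrative, not taken from any of the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed weights for a tiny one-hidden-layer network (illustrative values)
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 1))

def forward(x, p_drop=0.5):
    """One stochastic forward pass with inverted dropout on the hidden layer."""
    h = np.maximum(x @ W1, 0.0)          # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop  # randomly drop units
    h = h * mask / (1.0 - p_drop)        # rescale so expected activation is unchanged
    return h @ W2

def mc_dropout_predict(x, n_samples=200):
    """MC dropout: dropout stays on at test time.

    The mean over samples is the prediction; the standard deviation is
    a simple measure of predictive uncertainty.
    """
    preds = np.stack([forward(x) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.std(axis=0)

x = rng.normal(size=(8, 4))
mean, std = mc_dropout_predict(x)
```

Inputs where the per-sample predictions disagree most (largest `std`) are the ones the model is least certain about, which is exactly the signal the calibration work above exploits.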
# Small and Large Language Models
## Research Question
Understanding the distinctions, collaboration, performance, and application of small language models (SLMs) versus large language models (LLMs).
## Summary of Work
1. **Dynamic Scoring with Enhanced Semantics for Training-Free HOI Detection** explores Vision-Language models leveraging small and large features for interaction detection without training, emphasizing multi-head attention to fuse visual and textual features.
2. **DynaSearcher** proposes a dynamic knowledge graph augmented search agent using multi-reward reinforcement learning. It uses small models effectively for retrieval tasks, achieving results comparable to large models but with fewer resources.
# Small and Large Language Models
This summary provides insights from recent papers related to small and large language models (LLMs) and their applications, performance, and efficiency improvements.
## Summary of Work
1. **Dynamic Scoring with Enhanced Semantics for HOI Detection**
- Focus on Vision-Language Models (VLMs) improving human-object interaction detection without heavy training.
- Utilizes small visual cues and textual features for robust multimodal understanding.
# Summary of Research on Small and Large Language Models
This summary provides an overview of recent research on small language models (SLMs) and on techniques that pair small models with larger ones (LLMs) to improve efficiency, accuracy, or scalability.
## Summary of Work
1. **Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection**
   - This work explores vision-language models, which can include smaller models, for enhanced semantic understanding without extensive training, showing competitive results, especially on rare interactions.
2. **DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning**
# Should I Prioritize Recall or Precision in RAG Pipelines?
## Short Description
This report reviews recent research on the trade-offs and impacts of prioritizing recall versus precision when building Retrieval-Augmented Generation (RAG) pipelines. RAG combines retrieval of relevant documents with generative models and is widely used in question answering and other NLP applications that require external knowledge integration.
## Summary of Work
Recent papers emphasize hybrid and graph-enhanced retrieval mechanisms that improve both recall and precision simultaneously. For example, BifrostRAG introduces a dual-graph retrieval structure for multi-hop QA, achieving 92.8% precision and 85.5% recall, showing that balancing both can significantly enhance results.
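The recall/precision trade-off in the retrieval step can be made concrete with set-based metrics over retrieved and relevant document IDs. A minimal sketch (the document IDs and cutoffs below are illustrative):

```python
def retrieval_metrics(retrieved, relevant):
    """Precision and recall for one query's retrieved document set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant = ["d1", "d3", "d4"]

# Retrieving more documents typically raises recall at the cost of precision
p_top2, r_top2 = retrieval_metrics(["d1", "d3"], relevant)
p_top4, r_top4 = retrieval_metrics(["d1", "d3", "d2", "d4"], relevant)
```

Here widening the cutoff from two to four documents lifts recall (more of the relevant set is found) while precision drops (more off-topic passages reach the generator), which is the core tension RAG pipelines must tune.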
@hugobowne
hugobowne / df_profiler.py
Created March 2, 2022 00:31
Generates rapid, exploratory dataframe reports
import pandas as pd
import pandas_profiling
# Create small df
data = {"name": ["Hugo", "Ville"], "city": ["Sydney", "SF"]}
df = pd.DataFrame(data)
# Create report and save to file
profile = pandas_profiling.ProfileReport(df)
profile.to_file("df_report.html")
@hugobowne
hugobowne / scheme.py
Last active December 30, 2023 04:36
Dave Beazley had us implement a mini-scheme like interpreter in Python today: this is what we came up with.
# scheme.py
#
# Challenge: Can you implement a mini-scheme interpreter (program that's running another program) capable of
# executing the following code (now at bottom of file):
def seval(sexp, env):
    if isinstance(sexp, (int, float)):
        return sexp
    elif isinstance(sexp, str):  # Symbols
        return env[sexp]         # Evaluate symbol names in the 'env'
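The preview cuts off before the list-evaluation case. A minimal completion of the same idea, assuming an environment dict mapping symbols to values and Python callables; the `sapply` helper, tuple representation, and built-in names here are illustrative, not necessarily what the class produced:

```python
import operator

def seval(sexp, env):
    if isinstance(sexp, (int, float)):
        return sexp
    elif isinstance(sexp, str):    # Symbols evaluate to their binding in env
        return env[sexp]
    elif isinstance(sexp, tuple):  # Combinations: (proc arg1 arg2 ...)
        return sapply(sexp, env)
    raise TypeError(f"cannot evaluate {sexp!r}")

def sapply(sexp, env):
    op, *args = sexp
    if op == 'define':             # Special form: (define name expr)
        name, expr = args
        env[name] = seval(expr, env)
        return None
    proc = seval(op, env)          # Evaluate the operator, then the operands
    return proc(*[seval(a, env) for a in args])

env = {'+': operator.add, '*': operator.mul}
result = seval(('+', 2, ('*', 3, 4)), env)  # 2 + 3*4 -> 14
```

Special forms like `define` must be intercepted before operand evaluation, since evaluating the not-yet-bound name would fail; everything else follows plain applicative-order evaluation.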
@hugobowne
hugobowne / tweet_listener.py
Last active October 6, 2023 18:48
NOTE: this code is for a previous version of the Twitter API and I will not be updating in the near future. If someone else would like to, I'd welcome that! Feel free to ping me. END NOTE. Here I define a Tweet listener that creates a file called 'tweets.txt', collects streaming tweets as .jsons and writes them to the file 'tweets.txt'; once 100…
import json

import tweepy

class MyStreamListener(tweepy.StreamListener):
    def __init__(self, api=None):
        super(MyStreamListener, self).__init__()
        self.num_tweets = 0
        self.file = open("tweets.txt", "w")

    def on_status(self, status):
        tweet = status._json
        self.file.write(json.dumps(tweet) + '\n')
        self.num_tweets += 1
@hugobowne
hugobowne / README.md
Last active May 22, 2020 23:34
Civic Impact through Data Visualization: Exercise 1


These are the materials for my workshop on creating interactive data visualizations with D3! We will be using the following two tools to work through these exercises:

And please do not hesitate to reach out to me directly via email at [email protected] or over twitter @clearspandex

Throughout this workshop, you will learn how to make an interactive map of AirBnB listings in SF to better understand the company's impact on the city.