import win32gui, win32ui
from win32con import SRCCOPY
from numpy import frombuffer

'''
Optimized to be 6 times faster using the following techniques
- Reuse bitmaps, handles, and device contexts
- Use the application framebuffer instead of the compositor framebuffer (entire desktop)

This is not the fastest method. That would be to directly copy the data from the GPU back buffer
- https://web.archive.org/web/20121205062922/http://www.ring3circus.com/blog/2007/11/22/case-study-fraps/
'''

class FrameGrabber():
    def __init__(self, x: float, y: float, w: float, h: float, windowTitle: str = ""):
        # Capture the named window if a title is given, otherwise the whole desktop
        self.hwnd = win32gui.FindWindow(None, windowTitle) if windowTitle else win32gui.GetDesktopWindow()
        win_x1, win_y1, win_x2, win_y2 = win32gui.GetWindowRect(self.hwnd)
        win_w = win_x2 - win_x1
        win_h = win_y2 - win_y1

        # Values strictly between 0 and 1 are fractions of the window size,
        # anything else is taken as a pixel count
        self.pos = (
            round(x * win_w if 0 < x < 1 else x),
            round(y * win_h if 0 < y < 1 else y)
        )
        self.w = round(w * win_w if 0 < w < 1 else w)
        self.h = round(h * win_h if 0 < h < 1 else h)

        # Create the device contexts and the bitmap once so grab() can reuse them
        self.hwnddc = win32gui.GetWindowDC(self.hwnd)
        self.hdcSrc = win32ui.CreateDCFromHandle(self.hwnddc)
        self.hdcDest = self.hdcSrc.CreateCompatibleDC()
        self.bmp = win32ui.CreateBitmap()
        self.bmp.CreateCompatibleBitmap(self.hdcSrc, self.w, self.h)
        self.hdcDest.SelectObject(self.bmp)

    def grab(self):
        # Copy the capture region from the window DC into the reusable bitmap
        self.hdcDest.BitBlt((0, 0), (self.w, self.h), self.hdcSrc, self.pos, SRCCOPY)
        # frombuffer replaces the deprecated binary mode of fromstring.
        # The result is a read-only BGRA array; call .copy() if it must be modified.
        img = frombuffer(self.bmp.GetBitmapBits(True), dtype='uint8').reshape(self.h, self.w, 4)
        # To convert to RGB, use cv2.cvtColor(img, cv2.COLOR_BGRA2RGB)
        # This is often unnecessary if simple image filtering is being done
        return img

    def __del__(self):
        # Release the GDI resources created in __init__
        self.hdcSrc.DeleteDC()
        self.hdcDest.DeleteDC()
        win32gui.ReleaseDC(self.hwnd, self.hwnddc)
        win32gui.DeleteObject(self.bmp.GetHandle())
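
Two details of the class above can be tried without a Windows machine: how the `x`, `y`, `w`, `h` arguments are resolved, and what the BGRA to RGB conversion mentioned in `grab()` does to the bytes. The sketch below is only an illustration; `resolve` and `bgra_to_rgb` are hypothetical helper names that are not part of the gist, and the pure-Python conversion is far slower than doing it in NumPy or OpenCV.

```python
def resolve(value: float, extent: int) -> int:
    """Mirror the rule used in FrameGrabber.__init__ (hypothetical helper).

    Values strictly between 0 and 1 are a fraction of `extent`;
    anything else (including exactly 0 and 1) is taken as pixels.
    """
    return round(value * extent if 0 < value < 1 else value)


def bgra_to_rgb(bgra: bytes) -> bytes:
    """Drop alpha and swap blue/red in packed BGRA pixel data (hypothetical helper)."""
    if len(bgra) % 4:
        raise ValueError("BGRA data must be a multiple of 4 bytes")
    rgb = bytearray(len(bgra) // 4 * 3)
    rgb[0::3] = bgra[2::4]  # red is the third byte of each BGRA pixel
    rgb[1::3] = bgra[1::4]  # green stays in the middle
    rgb[2::3] = bgra[0::4]  # blue is the first byte of each BGRA pixel
    return bytes(rgb)


# Half of a 1920-pixel-wide window, versus an absolute pixel value
print(resolve(0.5, 1920), resolve(512, 1920))
# Two BGRA pixels converted to RGB
print(list(bgra_to_rgb(bytes([10, 20, 30, 255, 40, 50, 60, 255]))))
```

Note that a value of exactly 1 is treated as one pixel, not as 100% of the window. On the array returned by `grab()`, the same channel swap can be written as `img[..., 2::-1]`.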
> Optimizing this grabbing is super important since this is the first step and it dictates the processing speed of literally everything else after it, so whatever is the quickest way, I am all ears :D
I believe the absolute fastest way is to copy the data directly from the GPU back buffer.
As far as I know, this is the technique that Fraps, Nvidia ShadowPlay, and Windows Game DVR use for high-quality, high-frame-rate recording with minimal resource usage. I've never attempted this myself, but I might give it a go in Python if I have some free time over the weekend.
> I believe the absolute fastest way is to copy the data directly from the GPU back buffer.
Did some more digging and it seems like someone has been able to do what I mentioned.
https://github.com/SerpentAI/D3DShot
Unfortunately it's a little out of date, and installing it fails for me on Python 3.9 + Windows because Pillow 7 won't compile. Hopefully other people have better luck.
I downgraded to Python 3.8 to get D3DShot working, but unfortunately the performance was less than stellar:
from time import time
import d3dshot

# Old grab function (grab_screen_old is the earlier script being compared; not shown here)
start_time = time()
for _ in range(1000):
    img = grab_screen_old((0,0,512,512))
print(time() - start_time)
# 7.05 seconds

# FrameGrabber from this gist
start_time = time()
fg = FrameGrabber(0, 0, 512, 512, "Counter-Strike: Global Offensive")
for _ in range(1000):
    img = fg.grab()
print(time() - start_time)
# 1.68 seconds

# D3DShot
start_time = time()
d = d3dshot.create(capture_output="numpy")
for _ in range(1000):
    img = d.screenshot((0, 0, 512, 512))
print(time() - start_time)
# 21.48 seconds
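
For comparison, the totals above can be turned into frames per second. Note that the FrameGrabber and D3DShot timings include their one-time setup inside the timed region. The sketch below is only one way to separate setup from the loop; `time_calls` and `frames_per_second` are hypothetical helper names, and `perf_counter` is swapped in for `time` because it is intended for measuring short intervals.

```python
from time import perf_counter
from typing import Callable


def time_calls(fn: Callable[[], object], iterations: int = 1000) -> float:
    """Return the seconds spent calling `fn` `iterations` times.

    Any setup (creating a grabber, opening a device) should happen
    before this is called so that only the per-frame cost is measured.
    """
    start = perf_counter()
    for _ in range(iterations):
        fn()
    return perf_counter() - start


def frames_per_second(iterations: int, seconds: float) -> float:
    """Convert a total time for `iterations` grabs into a frame rate."""
    return iterations / seconds


# Totals reported above for 1000 grabs of a 512x512 region
reported = {"grab_screen_old": 7.05, "FrameGrabber": 1.68, "D3DShot": 21.48}
for name, seconds in reported.items():
    print(f"{name}: {frames_per_second(1000, seconds):.1f} frames/s")
```

With a grabber already constructed, the loop cost alone could be measured as `time_calls(fg.grab)`.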
I haven't taken the time to look through D3DShot's implementation, but I'm very surprised to see it perform so poorly. I'll keep trying to build my own hardware-accelerated implementation and report back my findings.
I did some testing and found that the fastest approach to this problem was indeed to copy from the GPU back-buffer. There's an article that goes into great detail on the subject here:
However, taking this approach entirely in Python would likely come with a performance hit, and it would be quite difficult given the need for granular access to low-level system and hardware calls. Maybe in the future I'll consider writing a library in C with Python bindings that does this work, but until then I'm putting this project on pause until I have some more time on my hands.
Thanks! Will compare to one of our other updated scripts. I thought I was on the latest version of our screen grabber, but wasn't.
Will also take a peek into the hardware acceleration. Optimizing this grabbing is super important since this is the first step and it dictates the processing speed of literally everything else after it, so whatever is the quickest way, I am all ears :D