@pabloko
Created January 28, 2025 19:56

There is a lot written about Windows kernel objects, but not much about Dxgk-style objects. If you look at how Windows shares DXGI resources across processes, you will notice that the handles created live in "dwm.exe" with the type DxgkSharedResource. There is in fact a userland API to manage those objects, aimed at graphics driver vendors developing user-mode drivers (UMDs) for graphics cards. When you create a DXGI texture, the driver ultimately calls D3DKMTCreateAllocation, which creates the dxgk kernel object that holds it on the system. Those objects hold two important blocks of memory:

  • Private Runtime Data: a struct defined and filled in by the OS (the DXGI user-mode side) that represents what the system knows about the resource, such as size and format. Its members sometimes change across Windows releases. Fun fact: you would think Microsoft provides driver developers with headers for these structures, but if you peek into the leaked Nvidia drivers you can watch the grind of reverse engineering them.
  • Private Driver Data: a struct defined by the driver developer (Nvidia, ATI...) that represents what the driver vendor knows about the resource, such as addresses and aligned sizes that only the driver implementer and their little computer know. Nvidia tends to modify this struct on every **** driver release.

So that's how DXGI and the driver share information at the resource level. With this in mind, using the API is simple. Years ago Microsoft did not provide a header for these functions, which meant writing a load thunk; nowadays Microsoft provides it in <D3dkmthk.h>.

When using shared textures, there is a nice feature called keyed mutex. When you create a texture with keyed mutex support, an additional kernel object, DxgkSharedKeyedMutexObject, appears in the dwm process. Unlike a normal mutex, which only holds the lock state, this one also locks on a key, and sets the key on unlock. To create this object the driver calls D3DKMTCreateKeyedMutex2, and it uses that family of APIs to manage it.

So what about creating a standalone keyed mutex to use for synchronization, IPC, a high-precision clock source...? Well, it's possible, and here is a small library to do it. The lock timeout has a very fine resolution, and the object is lightweight.

The syscalls for these operations appeared in Windows Server 2008 R2 (the Windows 7 generation). There is a second version of this API that adds private data buffer sections (I'll upload it upon request), readable on lock and writable on creation and unlock. The syscalls for version 2 appeared in Windows Server 2012 (the Windows 8 generation).

TLDR: A keyed mutex acts like a normal mutex, but apart from waiting to be unlocked, it also waits until the lock key matches the internal key of the object. The internal key is set on creation and on each unlock operation. Keyed mutexes are thread-safe and can be used across processes.
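To make those semantics concrete, here is a minimal portable sketch that models them with standard C++ primitives. It is only an illustration of the behaviour described above, not how the Dxgk kernel object is implemented, and it works within a single process only. The names `KeyedMutexModel` and `handoff_demo` are made up for this example.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical in-process model of keyed-mutex semantics (not the D3DKMT object).
class KeyedMutexModel {
public:
    explicit KeyedMutexModel(std::uint64_t initial_key) : key_(initial_key) {}

    // Waits until the mutex is free AND its stored key equals `key`,
    // or until `timeout` elapses. Returns true if the lock was taken.
    bool acquire(std::uint64_t key, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> guard(m_);
        const bool ok = cv_.wait_for(
            guard, timeout, [&] { return !held_ && key_ == key; });
        if (ok) held_ = true;
        return ok;
    }

    // Unlocks and stores the key the next acquirer has to present.
    // Returns false if the mutex was not held.
    bool release(std::uint64_t next_key) {
        {
            std::lock_guard<std::mutex> guard(m_);
            if (!held_) return false;
            held_ = false;
            key_ = next_key;
        }
        cv_.notify_all();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::uint64_t key_;
    bool held_ = false;
};

// Single-threaded walk through the key handoff.
inline void handoff_demo() {
    using std::chrono::milliseconds;
    KeyedMutexModel km(0);                                     // created with key 0
    const bool wrong_key = km.acquire(1, milliseconds(1));     // key mismatch
    const bool right_key = km.acquire(0, milliseconds(1));     // key matches
    km.release(1);                                             // next key is 1
    const bool stale_key = km.acquire(0, milliseconds(1));     // old key
    const bool fresh_key = km.acquire(1, milliseconds(1));     // new key
    km.release(0);
    assert(!wrong_key && right_key && !stale_key && fresh_key);
}
```

Two parties that alternate keys (one acquires on 0 and releases with 1, the other acquires on 1 and releases with 0) get a strict ping-pong ordering out of this, which is the pattern keyed mutexes are typically used for with shared textures.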

/**
* Windows Keyed Mutex Library
* @pabloko - [email protected] - 23/01/2024
* This file is a C / C++ wrapper for windows keyed mutex / events / condition variables.
**/
#include <windows.h> // first: D3dkmthk.h relies on the base Windows types
#include <D3dkmthk.h>
#ifdef __cplusplus
class KeyedMutex {
private:
#endif
/**
* Creates a keyed mutex and sets its initial key. Use handle to perform
* operations on this mutex; sharedhandle can be passed to other processes
* and opened there via km_open.
*/
static HRESULT km_create(/*in*/ UINT64 initialkey, /*out*/ HANDLE* handle,
/*out*/ HANDLE* sharedhandle)
{
D3DKMT_CREATEKEYEDMUTEX kmt = { 0 }; // zero-initialize; valid in both C and C++
kmt.InitialValue = initialkey;
HRESULT hr = D3DKMTCreateKeyedMutex(&kmt); // the D3DKMT call returns an NTSTATUS code
if (SUCCEEDED(hr) && handle && sharedhandle) {
// D3DKMT_HANDLE is a 32-bit UINT; widen through UINT_PTR before casting
*handle = (HANDLE)(UINT_PTR)kmt.hKeyedMutex;
*sharedhandle = (HANDLE)(UINT_PTR)kmt.hSharedHandle;
}
return hr;
}
/*
* Opens an existing keyed mutex from its shared handle. Use handle to perform
* operations on this mutex.
*/
static HRESULT km_open(/*in*/ HANDLE sharedhandle, /*out*/ HANDLE* handle)
{
D3DKMT_OPENKEYEDMUTEX kmt = { 0 }; // zero-initialize; valid in both C and C++
kmt.hSharedHandle = (D3DKMT_HANDLE)(UINT_PTR)sharedhandle;
HRESULT hr = D3DKMTOpenKeyedMutex(&kmt); // the D3DKMT call returns an NTSTATUS code
if (SUCCEEDED(hr) && handle) {
*handle = (HANDLE)(UINT_PTR)kmt.hKeyedMutex;
}
return hr;
}
/*
* Invalidates this handle to the mutex. The mutex itself is destroyed once
* the last handle that references it has been destroyed.
*/
static HRESULT km_destroy(/*in*/ HANDLE handle)
{
D3DKMT_DESTROYKEYEDMUTEX kmt = { 0 }; // zero-initialize; valid in both C and C++
kmt.hKeyedMutex = (D3DKMT_HANDLE)(UINT_PTR)handle;
return D3DKMTDestroyKeyedMutex(&kmt);
}
/*
* Locks the keyed mutex for a key. The timeout is an NT-style timeout in
* 100-nanosecond units: a positive value is an absolute time (timestamp)
* and a negative value is a relative interval. 0 returns immediately
* without waiting and MAXLONGLONG is effectively INFINITE.
*/
static HRESULT km_acquire(
/*in*/ HANDLE handle, /*in*/ UINT64 key, /*in*/ LARGE_INTEGER timeout)
{
D3DKMT_ACQUIREKEYEDMUTEX kmt = { 0 }; // zero-initialize; valid in both C and C++
kmt.hKeyedMutex = (D3DKMT_HANDLE)(UINT_PTR)handle;
kmt.Key = key;
kmt.pTimeout = &timeout; // 100 ns units; negative means relative
return D3DKMTAcquireKeyedMutex(&kmt);
}
/*
* Releases a previously locked keyed mutex and sets its key. The next
* km_acquire has to pass this key to take the lock.
*/
static HRESULT km_release(/*in*/ HANDLE handle, /*in*/ UINT64 key)
{
D3DKMT_RELEASEKEYEDMUTEX kmt = { 0 }; // zero-initialize; valid in both C and C++
kmt.hKeyedMutex = (D3DKMT_HANDLE)(UINT_PTR)handle;
kmt.Key = key;
return D3DKMTReleaseKeyedMutex(&kmt);
}
#ifdef __cplusplus
KeyedMutex(HANDLE h, HANDLE sh)
: m_handle(h)
, m_sharedhandle(sh)
{
}
public:
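~KeyedMutex() { km_destroy(m_handle); }
// Non-copyable: a copy would destroy the same handle twice.
KeyedMutex(const KeyedMutex&) = delete;
KeyedMutex& operator=(const KeyedMutex&) = delete;
static KeyedMutex* Create(UINT64 initialkey = 0)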
{
HANDLE handle = NULL, sharedhandle = NULL;
if (FAILED(km_create(initialkey, &handle, &sharedhandle)))
return NULL;
return new KeyedMutex(handle, sharedhandle);
}
static KeyedMutex* Open(HANDLE sharedhandle)
{
HANDLE handle = NULL;
if (FAILED(km_open(sharedhandle, &handle)))
return NULL;
return new KeyedMutex(handle, sharedhandle);
}
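BOOL Lock(UINT64 key, INT64 timeout = MAXLONGLONG)
{
if (m_locked)
return FALSE;
LARGE_INTEGER tm = { 0 };
tm.QuadPart = timeout;
// A timed-out wait may be reported as STATUS_TIMEOUT (0x102), which is a
// success-class NTSTATUS and passes SUCCEEDED(); only STATUS_SUCCESS (0)
// is treated as holding the lock.
m_locked = (km_acquire(m_handle, key, tm) == 0);
return m_locked;
}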
BOOL Unlock(UINT64 key)
{
if (!m_locked)
return FALSE;
m_locked = FAILED(km_release(m_handle, key));
return !m_locked;
}
BOOL TryLock(UINT64 key) { return Lock(key, 0); } // zero timeout: do not wait
BOOL IsLocked() { return m_locked; }
HANDLE GetHandle() { return m_handle; }
HANDLE GetSharedHandle() { return m_sharedhandle; }
private:
HANDLE m_handle = NULL;
HANDLE m_sharedhandle = NULL;
volatile BOOL m_locked = FALSE;
};
#endif
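The timeout convention documented on km_acquire (100-nanosecond units, negative for a relative interval) is easy to get wrong by a sign or a factor of ten, so a couple of small conversion helpers can help. The sketch below is plain C++ with no Windows dependency; the helper names `relative_timeout_ms`, `relative_timeout_us` and `is_relative_timeout` are hypothetical and not part of the library above.

```cpp
#include <cassert>
#include <cstdint>

// NT-style timeouts count 100 ns ticks.
const std::int64_t kTicksPerMicrosecond = 10;     // 1 us = 10 ticks
const std::int64_t kTicksPerMillisecond = 10000;  // 1 ms = 10,000 ticks

// Relative wait of `ms` milliseconds: a negative tick count.
inline std::int64_t relative_timeout_ms(std::int64_t ms) {
    assert(ms >= 0);  // a duration, not a timestamp
    return -(ms * kTicksPerMillisecond);
}

// Relative wait of `us` microseconds: a negative tick count.
inline std::int64_t relative_timeout_us(std::int64_t us) {
    assert(us >= 0);
    return -(us * kTicksPerMicrosecond);
}

// True when the value is read as an interval from now rather than
// as an absolute time.
inline bool is_relative_timeout(std::int64_t timeout) {
    return timeout < 0;
}
```

With the C++ wrapper above, a 16 ms wait on key 1 would then read `mutex->Lock(1, relative_timeout_ms(16))`, since Lock copies its INT64 timeout straight into the LARGE_INTEGER handed to km_acquire.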