A3 Design Doc

Rottweiler OS161

Stella Pantella - Roger Zurawicki

Questions

How do we set executable bits? everything is always readable, writable, executable
What is the point of the daemon. Is there a way to know when a page has been updated? dirty bit

Data Structures

Coremap

// max_pid is 64 = 2^6
struct cm_entry {
    uint32_t tlb_hi;
    // = TLB_HI[1/2]   32 bits
    // |-- VPN         20 bits
    // |-- pid         10 bits (0 for kernel, with valid set)
    // |-- valid       1 bit
    // |-- reserved    1 bit  (when rserved cannot evict)

    // for fine-grained implementation
    // unsigned lockbit;
};

Page Table

We assume one level constant size pagetable. An entry in the page table should have a valid spn.

// 64-bit struct
struct pt_entry {
    uint32_t tlb_lo;
    // = TLB_LO[1/2]    32 bits
    // |-- PPN               20 bits
    // |-- mode               4 bits
    // | |-- nocache          1 bit
    // | |-- dirty            1 bit
    // | |-- valid            1 bit
    // | |-- global (kernel)  1 bit
    // |-= TLB reserved       8 bits
    // | |-- swapped          1 bit
    // | |-- (unused)         3 bits
    // | |-- cpu_id           4 bits

    // swap index, points to address on swap disk = spn * PAGE_SZ
    spn_t spn;  // 32 bits
};

Process-dependent structures

// lock is per pagetable or per addrspace?
struct addrspace {
  pte_array stack_pt;
  pte_array heap_pt;
  vaddr_t heap_end;
};

## Kernel-dependent structures
```c
// Corepage size is fixed at boot-time = USERRAM_SZ/PAGE_SZ
// sizeof(struct coremap)
// 512 KB RAM -> 512+  B coremap
// 512 MB RAM -> 512+ KB coremap
// We use the clock LFU approximation algorithm
// points to the index of the map that will be a candidate for eviction
// if used we evict it and update the clock_hand
// do not use daemon index and clock index
struct coremap {
    struct cm_entry *cm_elts;
    size_t cm_sz;
    int clock_hand;  // used by eviction algorithm
    size_t daemon_idx;  // writing daemon to write pages to disk
    struct spinlock *cm_lk;  // cpu cannot go to sleep
};

Swapdisk Bitmap

We are not lazy about swapping. When the user calls malloc we allocate space on the swapdisk. We do not allocate space for kmallocs. If kmalloc is called, the area alloced is fixed on the coremap and will never change. If there are not n contiguous pages on swapdisk for a user malloc, then we fail.

// Bitmap size is set on bootstrap
// 1 MB disk -> 128+ b = 16+ B bitmap max size
// 4 GB disk -> 1+ Mb = 128+ KB bitmap max size
struct swapdisk_bitmap {
    struct bitmap bm;
    struct spinlock *bm_lk;
    struct vnode *bm_vn;
};

Macros and Interface prototypes

Note: MACROS use the actual struct, NOT a pointer.

Note: All functions are thread-unsafe, less noted otherwise.

#define CME_PPN(cme)            ((ppn_t)(&cme - &CME[0]))
#define CME_VPN(cme)            ((vpn_t)((cme).tlb_hi >> 12))
#define CME_PID(cme)            ((cme).tlb_hi & 0xfff)

#define PTE_PPN(pte)            ((ppn_t)((pte).tlb_lo >> 12))
#define PTE_ISNOCACHE(pte)      ((pte).tlb_lo >> 11 & 1)
#define PTE_ISDIRTY(pte)        ((pte).tlb_lo >> 10 & 1)
#define PTE_ISVALID(pte)        ((pte).tlb_lo >> 9 & 1)
#define PTE_ISKERNEL(pte)       ((pte).tlb_lo >> 8 & 1)
#define PTE_ISSWAPPED(pte)      ((pte).tlb_lo >> 7 & 1)
#define PTE_CPU(pte)            ((pte).tlb_lo & 0xf)

typedef uint64_t tlb_t;   // tlb_t should be 64 bits
typedef uint32_t ppn_t;   // Physcial page number type
typedef uint32_t vpn_t;   // Virtual Page Number tyle
typedef uint32_t spn_t;   // Swap Page Number Type


#define OFFSET_BITS 20
#define PN_OF_ADDR(ptr)         ((ptr) >> OFFSET_BITS)
#define ADDR_OF_PN(pn)          ((pn) << OFFSET_BITS)

#define CME(ppn)                (coremap->cm_elts[(ppn)])

/*
 * Helper functions
 */

/** Gets the pagetable entry for a given process and virtual address
 * TODO: Change in case we add support for regions
 */
inline struct pt_entry *pte_proc_vaddr(struct proc *p, vaddr_t ptr) {
  return &p->p_as.pt.pt_elts[PN(ptr)];
}

/** Looks up pagetable entry from coremap entry
 * @param cme[in] coremap entry to lookup
 * @return pagetable entry of page
 * @pre page is mapped in RAM by an active process
 */
inline struct pt_entry *cme_to_pte(struct cm_entry *cme) {
    struct proc *p = procvector->procs[cme->owner];
    struct pagetable *pt = p->p_as.pt;
    return pt.pt_elts[CME_VPN(cme)];
}

/** Looks up coremap entry from pagetable entry
 * @param[in] pte pagetable entry to lookup
 * @return coremap entry of page
 * @pre page is mapped in RAM by an active process
 */
inline struct cm_entry *pte_to_cme(struct pt_entry *pte) {
    return CME(PTE_PPN(*pte));
}

/** Convert page mapping to TLB entry
 * @param[in] cme coremap entry of page
 * @param[in] pte pagetable entry of page
 * @return TLB entry from @pte and @cme
 * @pre the page is active and mapped by @pte to @cme
 */
tlb_t tlb_entry(struct cm_entry *cme, struct pt_entry *pte) {
    tlb_t retval;
    uint32_t tlb_lo = pte->tlb_lo & ~(0xff);  // clears the reserve bits
    uint32_t tlb_hi = cme->tlb_hi & ~(0x3f);  // clears the reserve bits

  // TODO we might want to handle the ASID

    join32to64(&tlb_hi, &tlb_lo, &retval);
    return retval;
}

/*
 * VM mapping functions
 */

/** Gets an open coremap page, evicting if necessary
 * @return coremap entry of available page or NULL
 *
 * Will evict page as necessary
 * May block on I/O
 */
struct cm_entry *alloc_cme() {
  // TODO lock coremap
  // TODO use clock to find nearest free page

  // TODO if clock wraps around and cannot find free page

  // else manually evict
  page_deactivate(cme, pte);

  // TODO update coremap and return cme
}


/** Updates coremap to map physical page entry to virtual page
 * @param[in,out] coremap entry to update
 * @param[in] ptr virtual address of page to map
 * @param[in] owner written to coremap entry
 * @return physical address that @ptr maps to
 * @pre @ptr is not mapped in RAM
 * @pre @cme is inactive
 * @post coremap refects new mapping
 */
inline paddr_t cm_map(struct cm_entry *cme, vaddr_t ptr, pid_t owner) {
    cme->tlb_hi = PN_OF_ADDR(ptr);
    cme->tlb_hi <<= 12;
    cme->tlb_hi |= (owner & 0xfff);

    return CME_PPN(cme) | (ptr & 0xfff);
}

/** Grabs physical memory to map virtual page
 * @param[in] pid user process id, 0 means kernel
 * @param[in] ptr virtual address of page to map
 * @param[in] writeable false if page is readonly
 * @return physical address that @ptr maps to
 * @pre @ptr is not mapped in RAM
 * @pre @ptr is not swapped to disk
 * @pre Process @pid is active or equals 0
 * @post coremap refects new mapping
 * @post Process @pid's pagetable reflect new mapping
 *
 * Will evict page as necessary
 * May block on I/O
 */
paddr_t page_create(pid_t pid, vaddr_t ptr, int writable) {
    struct pt_entry *pte = pte_proc_vaddr(PROC(pid), ptr);
    struct cm_entry *cme = alloc_cme();

    ppn_t ppn = CM_PPN(cme);
    pte->tlb_lo = ppn << 12

    // set valid bit
    pte->tlb_lo |= (1 << 9);

    if (writeable) {
      pte->tlb_lo |= (1 << 10);
    }

  if (pid == 0) {
      // set global bit
      pte->tlb_lo |= (1 << 8);
    }

    // no need to save CPU, that info is in the proccess's kthread

  pte->owner = pid;

    return cm_map(cme, ptr, pid);
}

/** Sets up coremap to map virtual page
 * @param[in] pid user process id, 0 means kernel
 * @param[in] ptr virtual address of page to map
 * @return physical address that @ptr maps to
 * @pre @ptr is not mapped to RAM
 * @pre @ptr is swapped to disk
 * @pre Process @pid is active or equals 0
 * @post coremap refects new mapping
 * @post Process @pid's pagetable reflect new mapping in RAM
 *
 * Will evict page as necessary
 * May block on I/O
 */
paddr_t page_reactivate(pid_t pid, vaddr_t ptr) {
    return cm_map(alloc_cme(), ptr, pid);
}


/** Removes a page from memory, writing to swap as neccessary
 * @param[in,out] cme coremap entry of page to deactivate
 * @param[in,out] pte coremap entry of page to deactivate
 * @pre @pte belongs to active process
 * @pre @cme's page is active and mapped by owner's pagetable
 * @post physical page is filled with deadbeef
 *
 * May block on I/O
 */
void page_deactivate(struct cm_entry *cme, struct pt_entry *pte) {
  if (PTE_ISDIRTY(pte)) {
      KASSERT(PTE_ISSWAPPED(pte));
        swapout(CME_PPN(&cme), pte->spn);
    }

    pte->tlb_lo = 0;
    pte->spn = 0;


    cme->tlb_hi = 0;
    ppn_t ppn = CME_PPN(cme);
    fill_deadbeef(cme, PAGE_SZ);

}

Synchronization Policy

Lock ordering

We interact wth synchonization primitvies by the following order. We order by increasing acquisition pressure so that the critical section for high pressure synchronization primitives have smaller critical sections.

Locks (sorted by ascending acquisiton pressure)
- I/O locks (TBD, this might be too impractical to enforce)
- Pagetable Locks
- Coremap locks ...
- Big Locks
- In case of tie, ascending address in memory
Spinlocks
- Ordered by ascending address in memory
Atomic CAS, TAS, and other instructions

We can only acquire lock in ascending order (top-down). We free locks as necessary to enforce this invariant.

Coremap

If we use a lock for the coremap, it is necessary to acquire the lock for writes. It is sufficient to acquire the lock for reads, but is it necessary?

If we have a lockbit per core map element we can use CAS to lock the entry every time we need it.

Pagetable

Because each process has its own pagetable, this is not quite the same synchonization problem as airballoons.

Roger: We could have a biglock for all pagetables. I just do not know if that is preferable or not.

A race condition we need to watch out for is the following:

Scenario

Assume TLB and memory are full

Page X and Y have the same VPN.

Process A has a TLB miss for an active, valid page (Page X) (in coremap and pagetable)

Process B has a TLB miss for an inactive, valid page (Page Y) swapped to disk.

The Race

Process B evicts page X from memory to swap disk (write to A's pagetable & coremap).

Process B installs page Y in memory (write to B's pagetable & coremap).

Process B installs the TLB entry for page Y (read B's pagetable).

Process A sets the TLB to map page X (read A's pagetable). (Is page X active?)

Process B now issues a read to page X when it should have read page Y.

Process

Following from A2, we use p->p_lk to protect process structures (excluding) VM.

Swap Data Structure

We lock the swap disk wiht its lock.

Spin Counting Semaphore

struct spin_semaphore* spin_sem_create(const char* name, unsigned initial_count);

void spin_sem_destroy(struct spin_semaphore* sem);

void spin_P(struct spin_semaphore* sem) {
  KASSERT(sem != NULL);

  /* Use the semaphore spinlock to protect the wchan as well. */
  spinlock_acquire(&sem->sem_lock);
  // how about partial reads?
  while (sem->sem_count == 0) {
  spinlock_release(&sem->sem_lock);
    spinlock_acquire(&sem->sem_lock);

  }
  KASSERT(sem->sem_count > 0);
  sem->sem_count--;
  spinlock_release(&sem->sem_lock);
}

void spin_V(struct spin_semaphore* sem) {
  KASSERT(sem != NULL);

  spinlock_acquire(&sem->sem_lock);

  sem->sem_count++;
  KASSERT(sem->sem_count > 0);

  spinlock_release(&sem->sem_lock);
}

TLB Management

Context Switch

On switching threads we call as_activate, which should call vm_tlbflush (defined in "vm.h")

/** Flushes all tlb entries in the current cpu. Disables interrupts
 *
 *  @post all tlb entries are invalidated for the current cpu
 *
 *  Atomic at Machine level
 */
void vm_tlbflush(void) {
  int spl = splhigh();
  for (int i = 0; i < NUM_TLB; i++)
    tlb_write(TLBHI_INVALID(i), TLBLO_INVALID(), i);
  splx(spl);
}


/** Inserts entry in tlb, evicts entry if TLB is full
 *
 * @param[in] entryhi
 * @param[in] entrylo
 * @post all tlb entires across processors are valid
 * @pre @a entryhi and @a entrylo are valid
 *
 *  Atomic at Machine level
 */
void vm_tlbinsert(uint32_t entryhi, uint32_t entrylo) {
  int spl = splhigh();
  uint32_t ehi;
  uint32_t elo;

  // search for invalid entry
  for (int i = 0; i < NUM_TLB; i++) {
    tlb_read(&ehi, &elo, i);
    if (elo & TLBLO_VALID)
      continue;
    tlb_write(entryhi, entrylo, i);
    goto done;
  }

  // no invalid entry found, we need to evict by choosing random entry
  // TODO: init rand in main.c
  int idx = rand() % NUM_TLB;
  tlb_read(ehi, elo);
  // unmap that entry from the pagetable
  tlb_write(entryhi, entrylo, idx);

  done:
    splx(spl);
  return 0;
}

TLB Fault

/** Handles TLB fault
  * @param[in] faulttype type of fault
  * @param[in] ptr pointer to virtual address
  * @return 0 on success, errno descriptor on failure
  * @pre @a ptr is  page aligned
  * @pre page is not in TLB
  * Atomic at thread level
  */
int vm_fault(int faulttype, vaddr_t ptr) {

    int err = 0;
    int spl;
    uint32_t entryhi, entrylo;
    paddr_t paddr = 0;

    struct pt_entry *pte = pte_proc_vaddr(curproc, ptr);

    // check permissions based on region and however we decide to do that
    switch (faulttype) {
      case VM_FAULT_READONLY:
        /* We always create pages read-write, so we can't get this */
        panic("dumbvm: got VM_FAULT_READONLY\n");
      case VM_FAULT_READ:
      case VM_FAULT_WRITE:
        break;
      default:
        return EINVAL;
    }
   
    
    lock(PT);
	struct cm_elt_t *cme = ptr_to_cme(ptr);
    // page is in coremap
    if (cme != NULL) {
      lock(cm(ppn))
      update_tlb
      unlock(cm(ppn))
      unlock(PT)
    } else { // page is not in coremap
      lock(cm(i=1))
      
    
    
    }
    
    

    

      if (err) {
        err = EFAULT;
        goto done;
      }
    

    // at this point we have a valid physical address that cannot be swapped out
    paddr = ptr_to_paddr(ptr);

    // add entry to TLB
    entryhi = ptr;
    // TODO: set dirty bit right
    entrylo = paddr | TLBLO_DIRTY | TLBLO_VALID;
    // insert into tlb with interrupts off
    err = vm_tlbinsert(entryhi, entrylo);
    // unrserve coremap entry - n wo need to hold the lock

    done:
      return print_err(err);
}

Pagetable Management

/** Returns free coremap page, either by evicting or returning that page
  * @post we have the lock for that coremap entry on return
  * Atomic at thread level
  */
ppn_t get_cm_page(void) {

  ppn_t ppn;

  for (ppn = 0; ppn < coremap->sz; ++ppn) {
    coremap_lock_acquire(ppn));
    if (!IS_FREE(ppn) || IS_KERNEL(ppn))
      continue;
    else {
      cme_reserve(cme);
      break;
    }
    coremap_lock_release(coremap_lk(ppn));
  }

  if (ppn == coremap->sz) {
    ppn = get_cm_clock_idx();

    // grabs the coremap page lock
    err = swapcopyout(cm_slot, sidx);

    pagetable_lock_acquire(pagetable_lk);

    // sends the message across all cpus
    ipi_tlbshootdown();
    for (i in range(n_cpus)) {
      spin_P();
    }

    // invalidate entry in page table


    pagetable_lock_release(pagetable_lk);
    return ppn;
  }
}



/** Handles Page fault
  * @param[in] pte page table entry
  * @return 0 on success, errno descriptor on failure
  * @pre page is not in memory
  * @pre page is a user page
  *
  * Atomic at thread level
  */
int page_handler(struct pt_entry *pte) {
  int err;
  ppn_t ppn;

  // if no empty slot, run paging daemon
  // reserve the slot and grab the lock
  if (ppn == coremap->sz) {
    ppn = evict_reserve_cm_page();
    // when this returns we have the page lock
  }
  // now we have a valid empty slot -- OTHER PROC SHOULD NOT EVICT IT
  // while we are setting this up, another process cannot come and evict
  // our page because we will setup incosistent things
  lock_pagetable(vpn);
  if (page is not in swap)
    return EFAULT;
  if (page is in swap)
    err = swapcopyin(cm_slot, sidx);
  lock_release_pagetable(vpn);

  // update VPN -> PPN in curprocs page table
  return print_err(err);
}

Dynamic Pagetable management

We are using a dynamically allocated page table array for every address space. The odd indexes of the array correspond to heap v. addresses and the even indexes of the array correspond to stack v. addresses. That way we do not need to use 2-level page tables.

vpn_to_pt_idx(x) = if even then (x/2) else (n-(x/2 + 1)
pt_idx_to_vpn(x) = if x < HALF then (2x) else (2(x+n)-1)


(  0  ,  1  ,  2  , ... , n-2 , n-1 )
maps =>
(  0  , n-1 ,  1  , n-2 ,  2  , ... )

Swap Management

The main data structure used for swap management is the "swapdisk_bitmap". It has a global lock and uses the "lhdraw0:" disk.

/** Swaps out updates the contents of the swap page on disk
 *  @param[in] ppn  physical page number to be swapped out
 *  @param[in] sidx swap space index
 *  @param[in] npages number of pages
 *  @pre assumes the page is dirty
 *
 *  @pre assumes the page has spn @sidx
 *
 *  thread-safe with respect to swapdisk
 */
void
swapcopyout(ppn_t ppn, spn_t sidx, unsigned npages) {
  struct uio u;
  struct iovec iov;

  iov.iov_ubase = ADDR_OF_PN(ppn);
  iov.iov_len = PAGE_SZ;
  u.uio_iov = &iov;
  u.uio_iovcnt = 1;
  u.uio_resid = PAGE_SZ;
  u.uio_segflg = UIO_USERSPACE;
  u.uio_rw = UIO_WRITE;
  u.uio_space = curproc->p_addrspace;
  u.uio_offset = sidx * PAGE_SZ;

  lock_acquire(swapdisk_bitmap->bm_lk);
  VOP_WRITE(swapdisk->bm_vn, uio);
  lock_release(swapdisk_bitmap->bm_lk0;
}

/** set memory contents to that of swap space
 *  @param[in] ppn  physical page number to be swapped out
 *  @param[in] sidx swap space index
 *  @param[in] npages number of pages
 *
 *  @pre @ppn belongs to the process and is already alloced
 *  @pre assumes there is a page to cpy from swap space to @pp
 *  thread-safe with respect to swapdisk
 */
void
swapcopyin(ppn_t *ppn, spn_t sidx, unsigned npages) {
    struct uio u;
    struct iovec iov;

    iov.iov_ubase = ADDR_OF_PN(ppn);
    iov.iov_len = PAGE_SZ;
    u.uio_iov = &iov;
    u.uio_iovcnt = 1;
    u.uio_resid = PAGE_SZ;  // amount to read from the file
    u.uio_segflg = UIO_USERSPACE;
    u.uio_rw = UIO_WRITE;
    u.uio_space = curproc->p_addrspace;

    // set the position to write to based on the idx
    u.uio_offset = sidx * PAGE_SZ;

    lock_acquire(swapdisk_bitmap->bm_lk);
    VOP_READ(swapdisk->bm_vn, uio);
    lock_release(swapdisk_bitmap->bm_lk0;
}

/** Finds free n contiguous swap pages in disk
 *  @param[in] npages on disk
 *  @return -1 for no space, starting index for most pages
 *  @post updates swap bitmap to reflect new relationship
 */
spn_t
swapalloc(int npages) {
   // lock bitmap
   // check if npages exist in the bitmap, mark them as used
   // unlock
   return starting index;

}

/** Frees npages of swap starting at sidx
 *  @param[in] sidx starting idx
 *  @param[in] npages number of pages to free
 *  @return Does not fail
 */
void
swapdealloc(spn_t sidx, int npages) {
   // lock bitmap
   // unset the bits from sidx to sidx + npages - 1
   // TODO: should we bezo pages on swapdisk?
   // unlock
}

Daemon

This is a thread forked on startup and specified in vm_daemon_thread.

File IO

Bootstrapping

On vm_bootstrap we need to create a coremap that will keep track of the empty and allocated physical pages for us. The coremap will be allocated at the first few spaces of our vm space which will not be virtually mapped.

If the total size of the vm space available to us is TOTAL_SZ, then assume we have c entries in the coremap: c * (PAGE_SZ + sizeof(cm_entry)) = TOTAL_SZ

Our coremap will occupy the first

sizeof(struct cm_entry) * c bytes.

// vm.h
// flag to check whether vm has been initialized, our implementation of kmalloc depends on this
bool vm_initialized;


// ram.h
/*
 * Called very early in system boot to figure out how much physical
 * RAM is available.
 */

static
ram_bootstrap() {
  + vm_initialized = false;
}

// vm.h
/** called after ram_bootstrap
 *  initializes the coremap
 *  excludes itself from the mapped pages
 *  @pre firstpaddr and lastpaddr have not been invalidated
 *  @post firstpaddr and lastpaddr are invalidated
 */
void vm_bootstrap() {

    paddr_t lastpaddr = ram_getsize();
    paddr_t firstpaddr = ram_getfirstfree();
    unsigned total_pages = (firstpaddr - lastpaddr)/PAGE_SZ;
    unsigned coremap->cm_sz = total_pages * PAGE_SZ/(PAGE_SZ + sizeof(cm_entry))

    // initialize coremap
    coremap->cm_elts = (cm_entry *)(firstaddr);
    coremap->clock_hand = 0;
    coremap->daemon_idx = 0;

    // for (i = 0; i < coremap->cm_sz; ++i)
    //     coremap->cm_elts[i] = {};
    memset(coremap->cm_elts, '\0', sizeof(cm_entry)*coremap->cm_sz);

    vm_initalized = true;
}

// swapdisk.h
/** called after vm_bootstrap
 *  initializes the swapdisk bitmap
 */

swapdisk_bitmap swapdisk_bitmap;

void swapdisk_bootstrap() {

    // use lhdraw0 disk as the swap space
    err = vfs_open("lhdraw0:"�, flags, 0644, &swapdisk_bitmap.vn);
    if (err)
        panic("swapdisk_bootstrap failed\n");

    // get the size of the disk
    struct stat s;
    VOP_STAT(swapdisk_bitmap->bm_vn, &s);
    off_t nbits = s.st_size;

    // allocate nbits for the bitmap
    swapdisk_bitmap->bm = bitmap_create(nbits);
    swapdisk_bitmap->bm_lk = lock_create(â€œbitmap lockâ€�);
    if (swapdisk_bitmap->bm_lkâ == NULL)
        panic("swapdisk_bootstrap failed\n");
}

// Helper functions for the
set_swaping_idx(unsigned pa_idx, unsigned swaping_idx, struct addrspace as);
unsigned get_swaping_idx(unsigned pa_idx, struct addrspace as);
bool has_swaping_idx(unsigned pa_idx, struct addrspace as);

/** Writes pages from the coremap to swapdisk periodically
  *
  */
void
vm_daemon_thread() {

    lock_acquire(swapdisk_bitmap->bm_lk)
    if IS_DIRTY(CME(coremaps->coremap.daemon_idx))

    unsigned sidx;

    if (!has_sidx(coremap.daemon_idx, procvector->procs[owner]->p_addrspace))
        bswaping_idx = swapdisk_find_freeidx();
    lock_acquire(procvector->procs[owner]->p_addrspace->pt->pt_lk);
    // update swaping idx in addrspace and swaping bit in coremap elt
    set_swaping_idx(coremap.daemon_idx, sing_idx, procvector->procs[owner]->p_addrspace);

    lock_release(procvector->procs[owner]->p_addrspace->pt->pt_lk);
    else
        swaping_idx =
    get_swaping_idx(coremap.daemon_idx,procvector->procs[owner]->p_addrspace)


    // TODO: do proper write with uio objects
    VOP_WRITE(swapdisk_bitmap->bm_vn, idx * PAGE_SZ);
    lock_release(swapdisk_bitmap->bm_lk)

    coremap->daemon_idx++ % coremap->sz;
}

`sys_sbrk`

int sys_sbrk(intptr_t amount, int *retval) {
  if (amount == 0) {
      // simple value return
        //*retval =  curproc->p_as->break
        return 0;
    }
    //TODO
    // get npages we need to add to our heap

    // Free or create pages as needed
    // if we cant find a new page, ENOMEM (retval gets 1)
  // if amount + current val < 0, EINVAL
}

Address Space API

/** Initializes addrspace
 */
struct addrspace *as_create(void) {
  // initialize pt_elets array at size USERPRAM_SZ/PAGE_SZ
  // create the pt_lk
}


/** Copies addrspaces
 *  @param[in] src
 *  @param[in] dst
 *  @return error
 */
int
as_copy(struct addrspace *src, struct addrspace **ret) {

    // grap the addrspace lock of the src
    // foreach multipage allocation in the page table of src
    // create a page in swapdisk for the new proc
    // if failure at any point
      // release src lock
        as_destroy(ret);
}

/** Gets addrespace and flushes current CPU's TLB
 *
 *  @return void
 */
void
as_activate(void)
{
  struct addrspace *as;

  as = curproc_getas();
  if (as == NULL) {
    /*
     * Kernel thread without an address space; leave the
     * prior address space in place.
     */
    return;
  }

  tlb_flush();
}

/** Deactivates current TLB entries
 *
 */
void  as_deactivate(void) {
}

/** Destroys addrspace
 *  We should free all proces pages that have not been
 *  freed in swap bitmap and the coremap.

 *  @param[in] src
 *  @param[in] dst
 *  @return void
 */
void as_destroy(struct addrspace *) {
    // for each element in the page table, destroy the element
    // free the corresponding swap space
    // free the core map
}

/** Set up a segment at virtual address VADDR of size MEMSIZE.
 *  The segment in memory extends from VADDR up to (but not
 *  including) VADDR+MEMSIZE.
 *
 *  The READABLE, WRITEABLE, and EXECUTABLE flags are set if
 *  read write, or execute permission should be set on the
 *  segment.
 *  @param[in] as addrespace
 *  @param[in] vaddr
 *  @param[in] sz
 *  @param[in] readable permissions
 *  @param[in] writeable permissions
 *  @param[in] executable permissions
 */
int as_define_region(struct addrspace *as,
                      vaddr_t vaddr, size_t sz,
                      int readable,
                      int writeable,
                      int executable) {
     // for every entry in the pagetable, set the write bits
     // do not set the

}

/** set the permissions of the addrspace table as read and write to prepare for load
 *  as_complete_load will restore permissions to whatever they were before
 *  @param[in] as addrespace
 *  @return error
 */
int as_prepare_load(struct addrspace *as) {

}

/** restore permissions to whatever they were before as_prepare_load
 *  @param[in] as addrespace
 *  @return error
 */
int as_complete_load(struct addrspace *as) {


}

/** Sets the top of the stack
 *  @param[in] as addrespace
 *  @param[in,out] initstackptr stack ptr to set
 *  @return error
 */
int as_define_stack(struct addrspace *as, vaddr_t *initstackptr) {

    *initstackptr = USERSTACK;
    return 0;
}

Kernel allocation

Coremap representation

Kernel allocations over multiple pages are labeled such that the first page has the big set, and subsequent pages of an allocation have the bit unset.

Consider the following kmalloc calls

...
kmalloc(3*4096);
kmalloc(1*4096);
kmalloc(2*4096);
kmalloc(1*4096);
...

Here are the multipage_start bits of the following contiguous physical pages from cm_elts[i].tlb_hi:multipage_start

... | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | ...

`kmalloc`

Do we alloc pages before setting up the VM? We shouldnâ€™t!!!

What if we have space in an existing page?
- Letâ€™s setup a ref count of objects for each page, we can free when its zero
- we should stuff pages when we can
Do we start a new page?
- Hopefully not, we want to save space

/** Gets npages from virtual memory
  * Never shifts pages around.
  * If there are not n contiguous pages in memory
  * we just return null.
  * @pre if vm_initialized, go through coremap, if not stealmem
  * @return null or the start of the start address
  */
static
paddr_t
getppages(unsigned long npages)
{
  paddr_t addr;

    if (!vm_initialized) {
      spinlock_acquire(&stealmem_lock);

      addr = ram_stealmem(npages);

      spinlock_release(&stealmem_lock);
    } else {
      // linear greedy search for contiguous free pages
        int pages_left = npages;
        int start_idx = 0;

        //TOOD: locking

        for (int i = 0; i < MAX_PPN; i++ && pages_left > 0; i++)     {
             if ( CME_ISFREE(CME(i)) ) {
                    pages_left--;
                    CME_RESERVE(CME(i));
            }
            else {
                pages_left = npages;
                start_idx = i;
            }
          }

        // we did not find enough pages
        if (pages_left > 0) {
          // ENOMEM?
            return NULL;
        }

        // setup page structure
        kalloc_startpage(start_idx);
        for (int i = start_idx+1; i < start_idx + npages; i++) {
        kalloc_multipage(i);
        }

        return ADDR_OF_PN(start_idx);

    }
  return addr;
}

`kfree`

How do we free multiple pages? Do we keep track of next in our core map?
- we know a multipage allocation is contiguous in virtual memory, we can therefore find out all the pages we need to free
Mark the page as free
- USE THIS: as_zero_region coremap_elt->address_space = NULL
do TLB shootdown

kfree is called on a page with the multipage flag unset, return error

Otherwise, kfree will free the current page and all following pages until the multipage flag is set again

void kfree(void *ptr) {
    if (ptr == NULL) {
            return;
    }
    unsigned int ppn = PN_OF_ADDR(ptr);
    cm_entry cme = CME(ppn);

    if (!CME_ISMULTIPAGE(cme)) {
        return EINVAL;
    }

    do {
        free_kpages(ppn);
    cme_deactivate(cme);
    ppn++;
    cme = CME(ppn);
    } while (CME_ISMULTIPAGE(cme) && ppn <= PPN_MAX);

}

Statistics and

Number of dirty pages
Number of unused pages in swapdisk
Number of kernel pages
Number of pages in memory
Number of pages in swapdisk

Performance Analysis

TLB Hit Rate = # TLB Hits / # TLB Accessses (should be ~98%)

"Let's page to the cloud"

https://www.dropbox.com/developers/core/docs#files_put

Other TODOs

error handling page after sbrk should be marked as invalid daemon only writes a page to disk if itâ€™s already there if itâ€™s not there just wait for forced eviction to put it there

when destroying the address space, we should delete from swap when we free we should delete from swap

Questions

TLB Questions

Problem 1 (5 points)

Assuming that a user program just accessed a piece of data at (virtual) address X, describe the conditions under which each of the following can arise. If the situation cannot happen, answer "impossible" and explain why it cannot occur.

TLB miss, page fault

user/kernel proc is running and the page requested is not in the TLB and the page is not in memory TLB miss, no page fault
user/kernel proc is running and the page is still in memory but not in the tlb TLB hit, page fault
impossible
if the valid bit is not set then it is not a hit TLB hit, no page fault
this is the best case, the page is in memory and mapped in the TLB

Problem 2 (2 points)

A friend of yours who foolishly decided not to take 161, but who likes OS/161, implemented a TLB that has room for only one entry, and experienced a bug that caused a user-level instruction to generate a TLB fault infinitelyâ€”the instruction never completed executing! Explain how this could happen. (Note that after OS/161 handles an exception, it restarts the instruction that caused the exception.)

We need to have a page in TLB
instruction pointer memory needs TLB too

la ($a0), $a1 // int a = *0x8000000;

Does kernel needs a TLB entry?

No all kernel memory is unmapped in our implmentation

Problem 3 (3 points)

How many memory-related exceptions (i.e., hardware exceptions and other software exceptional conditions) can the following MIPS-like instruction raise? Explain the cause of each exception.

load word from $0 (zero register), offset 0x120, into register $3

lw  $3, 0x0120($0)

Exceptions at the MIPS level

#define EX_TLBL 2 /* TLB miss on load */ (permissions)
#define EX_ADEL 4 /* Address error on load */

Exceptions at the OS161 level: segfault (accesses illegal memory)

Malloc Questions

Question 1. Consider the following (useless) program:

/* This is bad code: it doesn't do any error-checking */
#include <stdio.h>
int main (int argc, char **argv) {
    int i;
    void *start, *finish;
    void *res[10];
    start = sbrk(0);
    for (i = 0; i < 10; i++) {
        res[i] = malloc(10);
    }
    finish = sbrk(0);
    /* TWO */
    return 0;
}

How many times does the system call sbrk() get called from within malloc()?

Once, the first malloc gets the initial page, then other malloc calls all use the memory allocated by the first page

On the i386 platform, what is the numeric value of (finish - start)? 4 KB (0x1000)

Question 2. Suppose that we now insert the following code at location `/* TWO */`:

{
    void *x;
    free(res[8]); free(res[7]); free(res[6]);
    free(res[1]); free(res[3]); free(res[2]);
    x = malloc(60);  /* MARK */
}

Again on the i386, would malloc() call sbrk() when doing that last allocation at the marked line above? What can you say about x?

No, a page is 4KB, and we are using less than 100 bytes. We know that res[9] < x < sbrk(0)

Question 3. It is conventional for libc internal functions and variables to be prefaced with "__". Why do you think this is so?

If we use the "__" prefix we can directly access protected variables and write code that requires more privilege. It should be helpful to the programmer.

Question 4. The man page for malloc requires that "the pointer returned must be suitably aligned for use with any data type." How does our implementation of malloc guarantee this?

Currently, we assign a new page each time we call malloc, even to malloc 1 byte. Each malloc therefore therefore is 4KB aligned.

stylianipantela/a3-design.md

A3 Design Doc

Data Structures

Coremap

Page Table

Process-dependent structures

Swapdisk Bitmap

Macros and Interface prototypes

Synchronization Policy

Lock ordering

Coremap

Pagetable

Scenario

The Race

Process

Swap Data Structure

Spin Counting Semaphore

TLB Management

Context Switch

TLB Fault

Pagetable Management

Dynamic Pagetable management

Swap Management

Daemon

File IO

Bootstrapping

sys_sbrk

Address Space API

Kernel allocation

Coremap representation

kmalloc

kfree

Statistics and

Performance Analysis

"Let's page to the cloud"

Other TODOs

Questions

TLB Questions

Problem 1 (5 points)

Problem 2 (2 points)

Problem 3 (3 points)

Malloc Questions

Question 1. Consider the following (useless) program:

Question 2. Suppose that we now insert the following code at location /* TWO */:

Question 3. It is conventional for libc internal functions and variables to be prefaced with "__". Why do you think this is so?

Question 4. The man page for malloc requires that "the pointer returned must be suitably aligned for use with any data type." How does our implementation of malloc guarantee this?

`sys_sbrk`

`kmalloc`

`kfree`

Question 2. Suppose that we now insert the following code at location `/* TWO */`: