I just spent several hours trying to configure a pseudo-distributed Hadoop cluster inside a Docker container. I wanted to write up our experience in case anyone else makes the mistake of trying this themselves.
When we tried to save a file to HDFS with the Java client, the NameNode appeared to accept it. Using hdfs dfs -ls we could see the file listed in HDFS, but it had a size of 0, indicating no data had actually made it into the cluster.
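The check described above can be reproduced from the shell. A sketch, assuming a hypothetical file path of /data/foo.csv (the output line is illustrative, not captured from our cluster); note that it is the size column, not the file's mere presence, that reveals the problem:

```shell
# List the file in HDFS; the fifth column is the file size in bytes.
hdfs dfs -ls /data/foo.csv

# Illustrative output - a size of 0 means the NameNode created the
# file entry but no blocks were ever written to a DataNode:
#   -rw-r--r--   1 user supergroup   0 2024-01-01 12:00 /data/foo.csv
```

This is worth knowing because hdfs dfs -ls happily lists the empty file, so the write can silently appear to succeed until you inspect the size.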
The client also emitted a fairly unhelpful stack trace. The error looked similar to this:
org.apache.hadoop.ipc.RemoteException(java.io.IOException):
File foo.csv could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) excluded in this operation.
This is just the stack trace from Hadoop, relayed back to the client.