Unicode.Scalar.ASCII.md

Proposed solution

Add a Unicode.Scalar.ASCII type (an alternative name/namespace is Unicode.ASCIIScalar). This type represents the ASCII-compatible subset of Unicode.Scalar.

Until something like compile-time constant expressions and compile-time assertions (static asserts) lands in the language, the compiler would perform a special range check for this type:

let s: Unicode.Scalar.ASCII = 'a' // OK
let s: Unicode.Scalar.ASCII = 'ÿ' // error: ÿ is not in the ASCII-compatible subset of Unicode

just as the compiler already has "magic" to check:

let s: Unicode.Scalar = "ÿa" // error: cannot convert value of type 'String' to specified type 'Unicode.Scalar'
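For comparison, here is a minimal sketch of what today's Swift offers: whether a scalar is ASCII can only be checked at run time, via isASCII or the trapping UInt8.init(ascii:). The variable names are only illustrative.

```swift
let a: Unicode.Scalar = "a"
let y: Unicode.Scalar = "ÿ"

// The ASCII check is a run-time property today, not a compile-time one.
let aIsASCII = a.isASCII
let yIsASCII = y.isASCII

// Fine for an ASCII scalar.
let byte = UInt8(ascii: a)

// UInt8(ascii: y) compiles, but would trap at run time,
// which is the kind of error the proposed type moves to compile time.
print(aIsASCII, yIsASCII, byte)
```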

Deprecations

Deprecate UInt8.init(ascii v: Unicode.Scalar) and replace it with a new initializer UInt8.init(_ ascii: Unicode.Scalar.ASCII), similar to the existing UInt32.init(_ v: Unicode.Scalar).
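A rough sketch of how the replacement initializer could look. ASCIIScalar is a hypothetical stand-in for the proposed Unicode.Scalar.ASCII, and its failable initializer is only an assumption made to keep the sketch runnable in today's Swift.

```swift
// Hypothetical stand-in for the proposed Unicode.Scalar.ASCII type.
struct ASCIIScalar {
    let value: UInt8

    // Returns nil for scalars outside the ASCII range.
    init?(_ scalar: Unicode.Scalar) {
        guard scalar.isASCII else { return nil }
        value = UInt8(scalar.value)
    }
}

extension UInt8 {
    // Sketch of the proposed replacement for UInt8.init(ascii:).
    // No run-time check is needed here, because the argument type
    // already guarantees an ASCII value.
    init(_ ascii: ASCIIScalar) {
        self = ascii.value
    }
}

let at = ASCIIScalar("@")!
let atByte = UInt8(at)
let nonASCII = ASCIIScalar("ÿ")   // nil, since ÿ is not ASCII
print(atByte, nonASCII == nil)
```

The point of the design is that the conversion to UInt8 itself can never fail; any failure is pushed to the place where the ASCII value is created.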

Example usage

The examples below use cases similar to those in the proposal: https://github.com/apple/swift-evolution/blob/master/proposals/0243-codepoint-and-character-literals.md

Example 1

let hexadecimalScalars: [Unicode.Scalar.ASCII] = [
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 
    'a', 'b', 'c', 'd', 'e', 'f'
]

for scalar in hexadecimalScalars {
    switch scalar {
    case 'a' ... 'f':
        // lowercase hex letter
        let nibble = scalar.value - Unicode.Scalar.ASCII('a').value + 10
    case 'A' ... 'F':
        // uppercase hex letter
        let nibble = scalar.value - Unicode.Scalar.ASCII('A').value + 10
    case '0' ... '9':
        // hex digit
        let nibble = scalar.value - Unicode.Scalar.ASCII('0').value
    default:
        // something else
        break
    }
}

Example 2

let byteBuffer: [UInt8] = [97, 98, 99, 100, 101, 102, 64] // you would not write these byte values here, they would come from a file, network or whatever

for byte in byteBuffer {
    switch byte {
    case 'a'...'f':
        // lowercase hex letter
        break
    case '@':
        // '@' character
        break

    // Without a pattern matching operator (~=) against UInt8, the cases
    // above would have to be spelled out like this, which is too verbose:
    // case Unicode.Scalar.ASCII('a').value...Unicode.Scalar.ASCII('f').value:
    // case Unicode.Scalar.ASCII('a').value:
    default:
        break
    }
}

Example 3

Suppose another type also conforms to ExpressibleByUnicodeScalarLiteral and defines a match against UInt8:

struct Foo: ExpressibleByUnicodeScalarLiteral {
    init(unicodeScalarLiteral: Unicode.Scalar) {
       ...
    }
    
    static func ~= (pattern: Foo, value: UInt8) -> Bool {
        ... 
    }
}

let byteBuffer: [UInt8] = [64]
for byte in byteBuffer {
    switch byte {
    case '@':   // Error: ambiguous use of operator '~='
        ...
    case Unicode.Scalar.ASCII('@'):   // OK
        ...
    }
}

Implementation for trying out in today's Swift

After replacing the single quotes ' with double quotes " in the above examples, the following sample implementation allows playing with them in Swift 4.2/5:

extension Unicode.Scalar {
    
    struct ASCII {
        let value: UInt8
    }
    
}
    
extension Unicode.Scalar.ASCII {
    init(_ unicodeScalar: Unicode.Scalar) {
        guard unicodeScalar.isASCII else {
            preconditionFailure()
        }
            
        value = UInt8(unicodeScalar.value)
    }
}

extension Unicode.Scalar.ASCII: ExpressibleByUnicodeScalarLiteral {
    init(unicodeScalarLiteral: Unicode.Scalar) {
        self.value = UInt8(ascii: unicodeScalarLiteral)
    }
}

extension Unicode.Scalar.ASCII: Strideable {
    func advanced(by n: Int8) -> Unicode.Scalar.ASCII {
        return Unicode.Scalar.ASCII(value: UInt8(Int8(self.value) + n))
    }
    
    func distance(to other: Unicode.Scalar.ASCII) -> Int8 {
        return Int8(other.value) - Int8(self.value)
    }
}

extension Unicode.Scalar.ASCII {
    static func ~= (pattern: Unicode.Scalar.ASCII, value: UInt8) -> Bool {
        return pattern.value == value
    }
}

extension ClosedRange where Bound == Unicode.Scalar.ASCII {
    static func ~= (pattern: ClosedRange<Unicode.Scalar.ASCII>, value: UInt8) -> Bool {
        // Use a closed range so that the upper bound itself also matches.
        return (pattern.lowerBound.value...pattern.upperBound.value).contains(value)
    }
}
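To try Example 2 end to end in today's Swift, a condensed sketch could look like the following. ASCIIScalar and classify are hypothetical names used only here; the type is a shortened stand-in for the sample Unicode.Scalar.ASCII, with plain Comparable in place of Strideable. Note that the range match uses a closed range, so the upper bound ("f") is included.

```swift
// Hypothetical, condensed stand-in for the sample Unicode.Scalar.ASCII type.
struct ASCIIScalar: ExpressibleByUnicodeScalarLiteral, Comparable {
    let value: UInt8

    init(unicodeScalarLiteral scalar: Unicode.Scalar) {
        // UInt8(ascii:) traps at run time for non-ASCII scalars.
        value = UInt8(ascii: scalar)
    }

    static func == (lhs: ASCIIScalar, rhs: ASCIIScalar) -> Bool {
        return lhs.value == rhs.value
    }

    static func < (lhs: ASCIIScalar, rhs: ASCIIScalar) -> Bool {
        return lhs.value < rhs.value
    }

    // Match a single ASCII scalar against a raw byte.
    static func ~= (pattern: ASCIIScalar, value: UInt8) -> Bool {
        return pattern.value == value
    }
}

extension ClosedRange where Bound == ASCIIScalar {
    // Match a closed range of ASCII scalars against a raw byte.
    static func ~= (pattern: ClosedRange<ASCIIScalar>, value: UInt8) -> Bool {
        return (pattern.lowerBound.value...pattern.upperBound.value).contains(value)
    }
}

// Classify a byte using scalar literals directly in the patterns.
func classify(_ byte: UInt8) -> String {
    switch byte {
    case "a"..."f": return "lowercase hex letter"
    case "@": return "at sign"
    default: return "other"
    }
}

let sampleBytes: [UInt8] = [97, 102, 64, 48]
for byte in sampleBytes {
    print(byte, classify(byte))
}
```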

bobergj/Unicode.Scalar.ASCII.md