@antimon2
Last active December 18, 2022 03:29
細かすぎて伝わらないかもしれないJuliaのTips.jl.ipynb
[compat]
julia = "1.7"
[deps]
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
{
"cells": [
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.942Z",
"end_time": "2022-12-18T12:25:37.872000+09:00"
},
"trusted": true
},
"id": "e5f3c20a",
"cell_type": "code",
"source": "versioninfo()",
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": "Julia Version 1.8.3\nCommit 0434deb161e (2022-11-14 20:14 UTC)\nPlatform Info:\n OS: Linux (x86_64-linux-gnu)\n CPU: 12 × Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz\n WORD_SIZE: 64\n LIBM: libopenlibm\n LLVM: libLLVM-13.0.1 (ORCJIT, skylake)\n Threads: 1 on 12 virtual cores\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"id": "b5df686f",
"cell_type": "markdown",
"source": "## `findXXX()` 系関数"
},
{
"metadata": {},
"id": "ea0d95cf",
"cell_type": "markdown",
"source": "疑問:\n\n+ 第1引数の関数(Do ブロック)に渡ってくる引数はコレクションの要素、戻り値はインデックスなのはなぜ?\n+ (1つ1つの)要素だけじゃなくて、複数の要素とかインデックスも参考にしながら条件判定したいんだけど…\n+ 例えば「同じ要素が3つ続く箇所を見つけてそのインデックスの範囲を返す」はどう実現すれば良い?"
},
{
"metadata": {},
"id": "f6cc7741",
"cell_type": "markdown",
"source": "### 解決法1:`for` 文で回せばOK"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.947Z",
"end_time": "2022-12-18T12:25:38.322000+09:00"
},
"trusted": true
},
"id": "1b762e25",
"cell_type": "code",
"source": "# 同じ要素がN個続く箇所を見つけてそのインデックスの範囲を返す\nfunction findNrepeats(a::AbstractVector, N)\n _lastindex = lastindex(a)\n for i in eachindex(a) # keys(a) でもOK\n r = i:i+N-1\n last(r) > _lastindex && break\n allequal(a[r]) && return r\n end\n nothing\nend",
"execution_count": 2,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 2,
"data": {
"text/plain": "findNrepeats (generic function with 1 method)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "624bdb77",
"cell_type": "markdown",
"source": "※ Julia の `findXXX()` 系の関数は「条件に合致するもののインデックス(の範囲)を返す関数(存在しなければ `nothing`)」という共通仕様がある(例外:`findmax()`/`findmin()`)ので、それに合わせた仕様にしてあります。"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "※ あと「コレクションの要素が全て同じかどうか」を判定するズバリ `allequal()` て関数が標準であります。便利♪知ってました?"
},
{
"metadata": {},
"id": "24ee21bf",
"cell_type": "markdown",
"source": "※ Julia のインデックスは(標準の1次元配列なら)1-originだけど、 \n  全てのコレクションのインデックスがそうとは限らない(設定次第)ので \n  `for i in 1:length(a)` のように書くのは危険!"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.952Z",
"end_time": "2022-12-18T12:25:38.549000+09:00"
},
"trusted": true
},
"id": "d60cc90e",
"cell_type": "code",
"source": "a = [0, 7, 7, 5, 0, 0, 0, 1, 1, 2]; # a[5:7] == [0, 0, 0]\nfindNrepeats(a, 3)",
"execution_count": 3,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 3,
"data": {
"text/plain": "5:7"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.955Z",
"end_time": "2022-12-18T12:25:38.739000+09:00"
},
"trusted": true
},
"id": "9060fad6",
"cell_type": "code",
"source": "findNrepeats([3, 1, 4, 1, 5, 9, 2, 6, 5, 3], 2) === nothing",
"execution_count": 4,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 4,
"data": {
"text/plain": "true"
},
"metadata": {}
}
]
},
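{
"metadata": {},
"cell_type": "markdown",
"source": "Note: `a[r]` copies each window into a new array, which is presumably the single allocation that the benchmark further down reports for `findNrepeats`. A minimal sketch of the same loop that checks a `view` of the window instead; `findNrepeats_view` is a name made up here and is not part of the original notebook:"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# Hypothetical variant of findNrepeats: identical search, but each window is a view, not a copy\nfunction findNrepeats_view(a::AbstractVector, N)\n    _lastindex = lastindex(a)\n    for i in eachindex(a)\n        r = i:i+N-1\n        last(r) > _lastindex && break\n        allequal(view(a, r)) && return r\n    end\n    nothing\nend\n\nfindNrepeats_view([0, 7, 7, 5, 0, 0, 0, 1, 1, 2], 3) # same window as above: 5:7",
"execution_count": null,
"outputs": []
},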
{
"metadata": {},
"id": "db89281e",
"cell_type": "markdown",
"source": "### 解決法2:無理矢理 `findXXX()` を利用する形に落とし込む"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.960Z",
"end_time": "2022-12-18T12:25:38.815000+09:00"
},
"trusted": true
},
"id": "71ae3f51",
"cell_type": "code",
"source": "# 同じ要素がN個続く箇所を見つけてそのインデックスの範囲を返す ver.2\nfunction findNrepeats_v2(a::AbstractVector, N)\n rs = [i:i+N-1 for i in eachindex(a) if i+N-1 ≤ lastindex(a)]\n index = findfirst(rs) do r\n allequal(a[r])\n end\n isnothing(index) ? nothing : rs[index]\nend",
"execution_count": 5,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 5,
"data": {
"text/plain": "findNrepeats_v2 (generic function with 1 method)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.964Z",
"end_time": "2022-12-18T12:25:38.862000+09:00"
},
"trusted": true
},
"id": "014e1774",
"cell_type": "code",
"source": "a = [0, 7, 7, 5, 0, 0, 0, 1, 1, 2]; # a[5:7] == [0, 0, 0]\nfindNrepeats_v2(a, 3)",
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 6,
"data": {
"text/plain": "5:7"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.969Z",
"end_time": "2022-12-18T12:25:38.863000+09:00"
},
"trusted": true
},
"id": "62619987",
"cell_type": "code",
"source": "findNrepeats_v2([3, 1, 4, 1, 5, 9, 2, 6, 5, 3], 2) === nothing",
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 7,
"data": {
"text/plain": "true"
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "8ad293f2",
"cell_type": "markdown",
"source": "### ベンチマーク"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.972Z",
"end_time": "2022-12-18T12:25:38.866000+09:00"
},
"trusted": true
},
"id": "8c347ea4",
"cell_type": "code",
"source": "using BenchmarkTools, Random",
"execution_count": 8,
"outputs": []
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.976Z",
"end_time": "2022-12-18T12:25:48.400000+09:00"
},
"trusted": true
},
"id": "7327ebbb",
"cell_type": "code",
"source": "N=3\nRandom.seed!(1234)\n@benchmark findNrepeats(a, $N) setup=(a=rand(0:9, 100))",
"execution_count": 9,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 9,
"data": {
"text/plain": "BenchmarkTools.Trial: 2112 samples with 992 evaluations.\n Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m38.025 ns\u001b[22m\u001b[39m … \u001b[35m7.615 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 48.66%\n Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m 2.642 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m 2.338 μs\u001b[22m\u001b[39m ± \u001b[32m1.352 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m6.92% ± 11.73%\n\n \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[34m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\u001b[39m█\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n 
\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[32m▂\u001b[39m\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[34m▂\u001b[39m\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▂\n 38 ns\u001b[90m Histogram: frequency by time\u001b[39m 5.27 μs \u001b[0m\u001b[1m<\u001b[22m\n\n Memory estimate\u001b[90m: \u001b[39m\u001b[33m80 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m1\u001b[39m."
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.980Z",
"end_time": "2022-12-18T12:25:51.630000+09:00"
},
"trusted": true
},
"id": "b84c20f2",
"cell_type": "code",
"source": "N=3\nRandom.seed!(1234)\n@benchmark findNrepeats_v2(a, $N) setup=(a=rand(0:9, 100))",
"execution_count": 10,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 10,
"data": {
"text/plain": "BenchmarkTools.Trial: 10000 samples with 13 evaluations.\n Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m942.231 ns\u001b[22m\u001b[39m … \u001b[35m344.311 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m 0.00% … 97.08%\n Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m 3.812 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m 0.00%\n Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m 4.033 μs\u001b[22m\u001b[39m ± \u001b[32m 11.516 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m13.81% ± 4.84%\n\n \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[34m \u001b[39m\u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m█\u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n 
\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[34m▂\u001b[39m\u001b[39m▂\u001b[32m▂\u001b[39m\u001b[39m▄\u001b[39m█\u001b[39m█\u001b[39m▄\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▂\n 942 ns\u001b[90m Histogram: frequency by time\u001b[39m 7.74 μs \u001b[0m\u001b[1m<\u001b[22m\n\n Memory estimate\u001b[90m: \u001b[39m\u001b[33m3.81 KiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m6\u001b[39m."
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "80881c4a",
"cell_type": "markdown",
"source": "※ 結局 `for` で回した方が高パフォーマンス!"
},
{
"metadata": {},
"id": "3523eb8a",
"cell_type": "markdown",
"source": "### 参考:文字列なら正規表現+`findXXX()` でOK"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.984Z",
"end_time": "2022-12-18T12:25:51.719000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "# 前準備:「同じ文字のN回繰り返し」という正規表現を生成しキャッシュする仕組み\n@generated function getNrepeatsRegex(::Val{N}) where {N}\n Regex(\"(.)\" * \"\\\\1\" ^ (N-1)) # N=3 なら `r\"(.)\\1\\1\"`\nend",
"execution_count": 11,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 11,
"data": {
"text/plain": "getNrepeatsRegex (generic function with 1 method)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.987Z",
"end_time": "2022-12-18T12:25:51.720000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "# 同じ文字がN個続く箇所を見つけてそのインデックスの範囲を返す\nfunction findNrepeats(s::AbstractString, N)\n # rex = Regex(\"(.)\" * \"\\\\1\" ^ (N-1)) # N=3 なら `r\"(.)\\1\\1\"`\n rex = getNrepeatsRegex(Val(N))\n findfirst(rex, s)\nend",
"execution_count": 12,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 12,
"data": {
"text/plain": "findNrepeats (generic function with 2 methods)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.990Z",
"end_time": "2022-12-18T12:25:51.734000+09:00"
},
"trusted": true
},
"id": "6e1e453b",
"cell_type": "code",
"source": "s = \"ABC123あああ😁漢字\" # s[7:13] == \"あああ\"\nfindNrepeats(s, 3)",
"execution_count": 13,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 13,
"data": {
"text/plain": "7:13"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### 参考の参考:文字列なら正規表現+`findXXX()` でOK(キャッシュしないバージョン)"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.994Z",
"end_time": "2022-12-18T12:25:51.735000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "# 同じ文字がN個続く箇所を見つけてそのインデックスの範囲を返す ver.2\nfunction findNrepeats_v2(s::AbstractString, N)\n rex = Regex(\"(.)\" * \"\\\\1\" ^ (N-1)) # N=3 なら `r\"(.)\\1\\1\"`\n # rex = getNrepeatsRegex(Val(N))\n findfirst(rex, s)\nend",
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 14,
"data": {
"text/plain": "findNrepeats_v2 (generic function with 2 methods)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:36.998Z",
"end_time": "2022-12-18T12:25:51.742000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "s = \"ABC123あああ😁漢字\" # s[7:13] == \"あああ\"\nfindNrepeats_v2(s, 3)",
"execution_count": 15,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 15,
"data": {
"text/plain": "7:13"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.002Z",
"end_time": "2022-12-18T12:25:56.039000+09:00"
},
"trusted": true
},
"id": "cab8a30b",
"cell_type": "code",
"source": "Random.seed!(1234)\nN = 3\n@benchmark findNrepeats(s, $N) setup=(s=randstring(\"123ABCあいう😁漢字\", 100))",
"execution_count": 16,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 16,
"data": {
"text/plain": "BenchmarkTools.Trial: 10000 samples with 372 evaluations.\n Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m250.341 ns\u001b[22m\u001b[39m … \u001b[35m 2.323 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m982.132 ns \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m859.185 ns\u001b[22m\u001b[39m ± \u001b[32m285.960 ns\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n\n \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[34m▅\u001b[39m\u001b[39m█\u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n 
\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[32m▂\u001b[39m\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▇\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m▅\u001b[39m▃\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▂\n 250 ns\u001b[90m Histogram: frequency by time\u001b[39m 1.56 μs \u001b[0m\u001b[1m<\u001b[22m\n\n Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.005Z",
"end_time": "2022-12-18T12:26:02.605000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "Random.seed!(1234)\nN = 3\n@benchmark findNrepeats_v2(s, $N) setup=(s=randstring(\"123ABCあいう😁漢字\", 100))",
"execution_count": 17,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 17,
"data": {
"text/plain": "BenchmarkTools.Trial: 10000 samples with 5 evaluations.\n Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m 6.111 μs\u001b[22m\u001b[39m … \u001b[35m 1.870 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m15.038 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m15.457 μs\u001b[22m\u001b[39m ± \u001b[32m19.184 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n\n \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▅\u001b[39m▆\u001b[39m▃\u001b[39m▅\u001b[39m▆\u001b[39m▄\u001b[39m▆\u001b[39m▇\u001b[39m▅\u001b[39m▅\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m█\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▇\u001b[34m▆\u001b[39m\u001b[32m▅\u001b[39m\u001b[39m▅\u001b[39m▆\u001b[39m▆\u001b[39m▄\u001b[39m▇\u001b[39m▅\u001b[39m▆\u001b[39m▅\u001b[39m▄\u001b[39m▆\u001b[39m▆\u001b[39m▅\u001b[39m▃\u001b[39m▃\u001b[39m▄\u001b[39m▄\u001b[39m▂\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n 
\u001b[39m▂\u001b[39m▄\u001b[39m▅\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▄\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▆\n 6.11 μs\u001b[90m Histogram: frequency by time\u001b[39m 28.9 μs \u001b[0m\u001b[1m<\u001b[22m\n\n Memory estimate\u001b[90m: \u001b[39m\u001b[33m80 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m3\u001b[39m."
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "c69de7f1",
"cell_type": "markdown",
"source": "※ `findXXX` は文字列検索の場合第1引数は関数だけではなくパターン(正規表現、文字の範囲等)もOK! \n※ ただし毎回正規表現を生成するのは意外とコスト高い(=遅い)ので何らかの方法でキャッシュする必要あり(正規表現リテラルが使用できるなら使用するなど) \n※ てかこのサンプルを作ることで `@generated`(生成関数)の使い途を1つ見つけてしまった♪"
},
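{
"metadata": {},
"cell_type": "markdown",
"source": "One possible way to do the caching without `@generated` is to memoize the compiled regex in a dictionary. This is only a sketch under that assumption; `NREPEATS_REGEX_CACHE` and `cachedNrepeatsRegex` are hypothetical names, and a global `Dict` like this is not safe to fill from several threads at once:"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# Hypothetical memoization: compile the regex for each N once and reuse it afterwards\nconst NREPEATS_REGEX_CACHE = Dict{Int,Regex}()\n\nfunction cachedNrepeatsRegex(N::Integer)\n    # get!(f, dict, key) only calls f when the key is missing\n    get!(NREPEATS_REGEX_CACHE, N) do\n        Regex(\"(.)\" * \"\\\\1\" ^ (N-1))\n    end\nend\n\nfindfirst(cachedNrepeatsRegex(3), \"ABC123あああ😁漢字\") # same range as above: 7:13",
"execution_count": null,
"outputs": []
},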
{
"metadata": {},
"id": "9633727c",
"cell_type": "markdown",
"source": "### 参考:`findXXX()`(第1引数に関数を受け取る使い方)はなんで「値で検索」結果はインデックスなのか?"
},
{
"metadata": {},
"id": "a7285092",
"cell_type": "markdown",
"source": "→ 配列や文字列だけじゃなくて、辞書(`Dict`)や名前付きタプル(`NamedTuple`)にも同じI/Fで利用できるから(それらの場合キー(の列)が返る)。"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.010Z",
"end_time": "2022-12-18T12:26:03.621000+09:00"
},
"trusted": true
},
"id": "b69bbb75",
"cell_type": "code",
"source": "D = Dict(name=>value for (value, name) in enumerate([\"Carol\", \"Alice\", \"Ellen\", \"Bob\", \"Dave\"]))\n@show findall(isodd, D); # 値が奇数であるようなエントリーのキーを全て列挙\n@show [D[key] for key in findall(isodd, D)] # 結果的に `filter(isodd, collect(values(D)))` と同じ",
"execution_count": 18,
"outputs": [
{
"output_type": "stream",
"text": "findall(isodd, D) = [\"Carol\", \"Dave\", \"Ellen\"]\n[D[key] for key = findall(isodd, D)] = [1, 5, 3]\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 18,
"data": {
"text/plain": "3-element Vector{Int64}:\n 1\n 5\n 3"
},
"metadata": {}
}
]
},
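{
"metadata": {},
"cell_type": "markdown",
"source": "The same calls work on a `NamedTuple`, where the keys are `Symbol`s. A small illustration (the tuple `nt` is made up for this example):"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "nt = (a = 1, b = 2, c = 3)\n@show findfirst(iseven, nt); # key of the first even value: :b\n@show findall(isodd, nt);    # keys of all odd values: [:a, :c]\n@show findfirst(>(10), nt);  # no match: nothing",
"execution_count": null,
"outputs": []
},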
{
"metadata": {},
"cell_type": "markdown",
"source": "## 続・`findXXX()` 系関数"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "疑問:\n\n+ `findXXX()` 系関数の理念や使い途は分かった。分かったけれどやっぱ使いにくい!\n 1. インデックスやキーで参照できるコレクションしか扱えない\n 2. 戻り値がインデックスやキーなのが分かりにくい\n+ Julia に「一般のイテレータで使える」「条件に合致する最初の要素を取得(なければ `nothing`)」っていう関数はないの?"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### 解決法1:`Iterators.dropwhile()` と `Base.first()` を組み合わせればOK"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.014Z",
"end_time": "2022-12-18T12:26:03.706000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "# 第1引数で条件判定して、第2引数のコレクションで最初に合致する要素を返す(なければ `nothing`)\nfunction meetfirst_v1(pred::Function, itr)\n # NG: `return first(Iterators.dropwhile(!pred, itr))`\n itr2 = Iterators.dropwhile(!pred, itr)\n try\n first(itr2)\n catch e\n if isa(e, BoundsError) || isa(e, ArgumentError)\n return nothing\n end\n rethrow(e)\n end\nend",
"execution_count": 19,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 19,
"data": {
"text/plain": "meetfirst_v1 (generic function with 1 method)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.017Z",
"end_time": "2022-12-18T12:26:03.706000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "a = [314, 159, 265, 358, 979, 323, 846, 264];",
"execution_count": 20,
"outputs": []
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.021Z",
"end_time": "2022-12-18T12:26:03.900000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "meetfirst_v1(n -> n % 11 == 0, a)",
"execution_count": 21,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 21,
"data": {
"text/plain": "979"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.024Z",
"end_time": "2022-12-18T12:26:03.917000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "meetfirst_v1(n -> n % 7 == 0, a) === nothing",
"execution_count": 22,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 22,
"data": {
"text/plain": "true"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.027Z",
"end_time": "2022-12-18T12:26:04.001000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "# コラッツ数列を列挙するイテレータ(`Channel`)を返す関数\nfunction collatz(n::Int)\n Channel{Int}() do chnl\n put!(chnl, n)\n while n > 1\n n = iseven(n) ? n ÷ 2 : 3n + 1\n put!(chnl, n)\n end\n end\nend",
"execution_count": 23,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 23,
"data": {
"text/plain": "collatz (generic function with 1 method)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.031Z",
"end_time": "2022-12-18T12:26:04.137000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "meetfirst_v1(≥(200), collatz(27))",
"execution_count": 24,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 24,
"data": {
"text/plain": "214"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.035Z",
"end_time": "2022-12-18T12:26:04.195000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "# 参考\nlet c27=collect(collatz(27))\n c27[findfirst(≥(200), c27)]\nend",
"execution_count": 25,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 25,
"data": {
"text/plain": "214"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.038Z",
"end_time": "2022-12-18T12:26:04.196000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "meetfirst_v1(≥(50), collatz(3)) === nothing",
"execution_count": 26,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 26,
"data": {
"text/plain": "true"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "※ 基本的な発想は `first(Iterators.dropwhile(!pred, itr))` \n※ `!pred` というのは、`pred()` 関数の結果(`Bool`値 という前提)を否定、つまり `x->!(pred(x))` 相当 \n※ `first(《空のイテレータ》)` は例外が発生してしまうので適切な対処が必要"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### 解決法2:パフォーマンス気にしてみる"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.043Z",
"end_time": "2022-12-18T12:26:04.271000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "# 第1引数で条件判定して、第2引数のコレクションで最初に合致する要素を返す(なければ `nothing`) ver.2\nfunction meetfirst(pred::Function, itr)\n itr2 = Iterators.dropwhile(!pred, itr)\n next = iterate(itr2)\n isnothing(next) ? nothing : first(next) # `first(next)` はタプルの第1要素を取得している\nend",
"execution_count": 27,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 27,
"data": {
"text/plain": "meetfirst (generic function with 1 method)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.046Z",
"end_time": "2022-12-18T12:26:04.272000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "a = [314, 159, 265, 358, 979, 323, 846, 264];",
"execution_count": 28,
"outputs": []
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.049Z",
"end_time": "2022-12-18T12:26:04.289000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "meetfirst(n -> n % 11 == 0, a)",
"execution_count": 29,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 29,
"data": {
"text/plain": "979"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.052Z",
"end_time": "2022-12-18T12:26:04.301000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "meetfirst(n -> n % 7 == 0, a) === nothing",
"execution_count": 30,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 30,
"data": {
"text/plain": "true"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.055Z",
"end_time": "2022-12-18T12:26:04.316000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "meetfirst(≥(200), collatz(27))",
"execution_count": 31,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 31,
"data": {
"text/plain": "214"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.058Z",
"end_time": "2022-12-18T12:26:04.317000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "meetfirst(≥(50), collatz(3)) === nothing",
"execution_count": 32,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 32,
"data": {
"text/plain": "true"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "※ Julia のイテレーションの仕組み(`iterate()` 関数の使い方・戻り値)を知っていれば、`first()` 関数相当のことを簡単に実現できてしかも例外処理不要! \n※ (簡単におさらい) `iterate(itr)` は、最初の要素があるときは `(《要素》, 《状態オブジェクト》)`、なければ `nothing`"
},
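{
"metadata": {},
"cell_type": "markdown",
"source": "To make the recap concrete, here is what `iterate()` returns for a plain array, and how a `for` loop can be spelled out by hand with it. This is a sketch for illustration only; `collect_by_hand` is a made-up helper, and the exact shape of the state object is an implementation detail of each iterator:"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "xs = [10, 20, 30]\n\n@show iterate(xs);    # first element plus the state to pass to the next call\n@show iterate(Int[]); # empty collection, so `nothing`\n\n# Roughly what `for x in itr ... end` does under the hood\nfunction collect_by_hand(itr)\n    seen = Any[]\n    next = iterate(itr)\n    while next !== nothing\n        (x, state) = next\n        push!(seen, x)\n        next = iterate(itr, state)\n    end\n    seen\nend\n\ncollect_by_hand(xs) == xs",
"execution_count": null,
"outputs": []
},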
{
"metadata": {},
"cell_type": "markdown",
"source": "### ベンチマーク"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.063Z",
"end_time": "2022-12-18T12:26:15.033000+09:00"
},
"trusted": true
},
"cell_type": "code",
"source": "Random.seed!(1234)\n@benchmark meetfirst_v1(≥(200), c) setup=(c=collatz(rand(3:100)))",
"execution_count": 33,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 33,
"data": {
"text/plain": "BenchmarkTools.Trial: 3885 samples with 9 evaluations.\n Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m 2.262 μs\u001b[22m\u001b[39m … \u001b[35m226.690 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m167.101 μs \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m138.386 μs\u001b[22m\u001b[39m ± \u001b[32m 62.275 μs\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n\n \u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m█\u001b[34m▃\u001b[39m\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n 
\u001b[39m█\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[32m▂\u001b[39m\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▆\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m▅\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m \u001b[39m▃\n 2.26 μs\u001b[90m Histogram: frequency by time\u001b[39m 197 μs \u001b[0m\u001b[1m<\u001b[22m\n\n Memory estimate\u001b[90m: \u001b[39m\u001b[33m48 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m2\u001b[39m."
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true,
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.067Z",
"end_time": "2022-12-18T12:26:18.139000+09:00"
}
},
"cell_type": "code",
"source": "Random.seed!(1234)\n@benchmark meetfirst(≥(200), c) setup=(c=collatz(rand(3:100)))",
"execution_count": 34,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 34,
"data": {
"text/plain": "BenchmarkTools.Trial: 10000 samples with 746 evaluations.\n Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m212.961 ns\u001b[22m\u001b[39m … \u001b[35m800.534 ns\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m258.094 ns \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m295.128 ns\u001b[22m\u001b[39m ± \u001b[32m 87.819 ns\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m0.00% ± 0.00%\n\n \u001b[39m \u001b[39m \u001b[39m▄\u001b[39m▅\u001b[39m▇\u001b[39m█\u001b[39m▆\u001b[34m▅\u001b[39m\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n 
\u001b[39m▄\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m▇\u001b[39m▇\u001b[39m▄\u001b[39m▃\u001b[32m▂\u001b[39m\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▂\u001b[39m▃\u001b[39m▄\u001b[39m▆\u001b[39m▄\u001b[39m▄\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m▂\n 213 ns\u001b[90m Histogram: frequency by time\u001b[39m 611 ns \u001b[0m\u001b[1m<\u001b[22m\n\n Memory estimate\u001b[90m: \u001b[39m\u001b[33m0 bytes\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m0\u001b[39m."
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "※ Julia の例外処理は意外と重い!"
},
{
"metadata": {},
"id": "dca002c2",
"cell_type": "markdown",
"source": "## 文字列の折り返し"
},
{
"metadata": {},
"id": "fbb79ebb",
"cell_type": "markdown",
"source": "疑問:\n\n+ Julia はフリーインデントだから、長い配列を直書きするときは適宜改行して書けるから良いよね。\n+ でも長い文字列はずらーっと横に長くなっちゃうよね…\n+ だって Julia の文字列は中に改行を入れられる=途中で改行すると改行文字になっちゃうじゃん…"
},
{
"metadata": {},
"id": "8a840072",
"cell_type": "markdown",
"source": "### 解決:行末に `\\` を入れることで折り返しできる(ようになった)よ!(≥v1.7)"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.071Z",
"end_time": "2022-12-18T12:26:18.586000+09:00"
},
"trusted": true
},
"id": "f3520cc8",
"cell_type": "code",
"source": "jgm0 = \"寿限無寿限無五劫の擦り切れ海砂利水魚の水行末雲来末風来末食う寝る処に住む処やぶらこうじのぶらこうじパイポ・パイポ・パイポのシューリンガン・シューリンガンのグーリンダイ・グーリンダイのポンポコピーのポンポコナの長久命の長助\"",
"execution_count": 35,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 35,
"data": {
"text/plain": "\"寿限無寿限無五劫の擦り切れ海砂利水魚の水行末雲来末風来末食う寝る処に住む処やぶらこうじのぶらこうじパイポ・パイポ・パイポのシューリンガン・シューリンガンのグーリンダイ・グーリンダイのポンポコピーのポンポコナの長久命の長助\""
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.074Z",
"end_time": "2022-12-18T12:26:18.587000+09:00"
},
"trusted": true
},
"id": "006ac06d",
"cell_type": "code",
"source": "jgm1 = \"寿限無寿限無五劫の擦り切れ\\\n海砂利水魚の水行末雲来末風来末\\\n食う寝る処に住む処やぶらこうじのぶらこうじ\\\nパイポ・パイポ・パイポのシューリンガン・シューリンガンのグーリンダイ・グーリンダイのポンポコピーのポンポコナの\\\n長久命の長助\"",
"execution_count": 36,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 36,
"data": {
"text/plain": "\"寿限無寿限無五劫の擦り切れ海砂利水魚の水行末雲来末風来末食う寝る処に住む処やぶらこうじのぶらこうじパイポ・パイポ・パイポのシューリンガン・シューリンガンのグーリンダイ・グーリンダイのポンポコピーのポンポコナの長久命の長助\""
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.076Z",
"end_time": "2022-12-18T12:26:18.587000+09:00"
},
"trusted": true
},
"id": "1309be74",
"cell_type": "code",
"source": "jgm2 = \"\"\"\n 寿限無寿限無五劫の擦り切れ海砂利水魚の水行末雲来末風来末食う寝る処に住む処やぶらこうじのぶらこうじ\\\n パイポ・パイポ・パイポのシューリンガン・シューリンガンのグーリンダイ・グーリンダイのポンポコピーのポンポコナの\\\n 長久命の長助\"\"\"\n# ↑ `\"\"\"~\"\"\"` ならインデントも無視されるのでさらに見やすく!",
"execution_count": 37,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 37,
"data": {
"text/plain": "\"寿限無寿限無五劫の擦り切れ海砂利水魚の水行末雲来末風来末食う寝る処に住む処やぶらこうじのぶらこうじパイポ・パイポ・パイポのシューリンガン・シューリンガンのグーリンダイ・グーリンダイのポンポコピーのポンポコナの長久命の長助\""
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.079Z",
"end_time": "2022-12-18T12:26:18.588000+09:00"
},
"trusted": true
},
"id": "cbec887a",
"cell_type": "code",
"source": "jgm0 == jgm1 == jgm2 # もちろん文字列として同一視できる",
"execution_count": 38,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 38,
"data": {
"text/plain": "true"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.081Z",
"end_time": "2022-12-18T12:26:18.589000+09:00"
},
"trusted": true
},
"id": "86f801bb",
"cell_type": "code",
"source": "jgm0 === jgm1 === jgm2 # リテラルとして同一になる(今回の場合)ので `===` で比較しても `true` になる!",
"execution_count": 39,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 39,
"data": {
"text/plain": "true"
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "374a4919",
"cell_type": "markdown",
"source": "### 参考:<v1.7 なら取り敢えず `\"~\" * \"~\"` で…"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.085Z",
"end_time": "2022-12-18T12:26:18.590000+09:00"
},
"trusted": true
},
"id": "780e3607",
"cell_type": "code",
"source": "jgm_oldstyle = \n \"寿限無寿限無五劫の擦り切れ海砂利水魚の水行末雲来末風来末食う寝る処に住む処やぶらこうじのぶらこうじ\" *\n \"パイポ・パイポ・パイポのシューリンガン・シューリンガンのグーリンダイ・グーリンダイのポンポコピーのポンポコナの\" *\n \"長久命の長助\"",
"execution_count": 40,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 40,
"data": {
"text/plain": "\"寿限無寿限無五劫の擦り切れ海砂利水魚の水行末雲来末風来末食う寝る処に住む処やぶらこうじのぶらこうじパイポ・パイポ・パイポのシューリンガン・シューリンガンのグーリンダイ・グーリンダイのポンポコピーのポンポコナの長久命の長助\""
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "58efd77e",
"cell_type": "markdown",
"source": "※ 最適化の関係でこれくらいの長さの文字列ならたぶん `jgm0 === jgm_oldstyle` になります。。。"
},
{
"metadata": {},
"id": "9637c950",
"cell_type": "markdown",
"source": "## `OrderedDict`/`SortedSet`"
},
{
"metadata": {},
"id": "48a617cc",
"cell_type": "markdown",
"source": "疑問:\n\n+ Julia の `Dict` や `Set` って追加順やキーの序列と列挙順が順不同ですよね…\n+ パフォーマンスが理由でそうなっているのだろうとは思うのだけれど、やっぱ所謂 `OrderdDict` とか `SortedSet` 欲しい…"
},
{
"metadata": {},
"id": "b705b305",
"cell_type": "markdown",
"source": "### 解決1:外部パッケージ `DataStructures` にあるからそれ使うのが素直な解決法"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.088Z",
"end_time": "2022-12-18T12:26:32.650000+09:00"
},
"trusted": true
},
"id": "467aa74f",
"cell_type": "code",
"source": "]add DataStructures",
"execution_count": 41,
"outputs": [
{
"output_type": "stream",
"text": "\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m registry at `~/.julia/registries/General`\n\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m git-repo `[email protected]:JuliaRegistries/General.git`\n\u001b[32m\u001b[1m Resolving\u001b[22m\u001b[39m package versions...\n\u001b[32m\u001b[1m No Changes\u001b[22m\u001b[39m to `~/.julia/environments/v1.8/Project.toml`\n\u001b[32m\u001b[1m No Changes\u001b[22m\u001b[39m to `~/.julia/environments/v1.8/Manifest.toml`\n",
"name": "stderr"
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.091Z",
"end_time": "2022-12-18T12:26:33.206000+09:00"
},
"trusted": true
},
"id": "06051cb9",
"cell_type": "code",
"source": "# 標準の Dict: 順不同\ndic = Dict(name=>value for (value, name) in enumerate([\"Carol\", \"Alice\", \"Ellen\", \"Bob\", \"Dave\"]))",
"execution_count": 42,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 42,
"data": {
"text/plain": "Dict{String, Int64} with 5 entries:\n \"Carol\" => 1\n \"Alice\" => 2\n \"Dave\" => 5\n \"Ellen\" => 3\n \"Bob\" => 4"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.093Z",
"end_time": "2022-12-18T12:26:33.282000+09:00"
},
"trusted": true
},
"id": "8b77a8c7",
"cell_type": "code",
"source": "using DataStructures",
"execution_count": 43,
"outputs": []
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.097Z",
"end_time": "2022-12-18T12:26:33.806000+09:00"
},
"trusted": true
},
"id": "e5223d18",
"cell_type": "code",
"source": "# OrderedDict: 追加順\nodic = OrderedDict(name=>value for (value, name) in enumerate([\"Carol\", \"Alice\", \"Ellen\", \"Bob\", \"Dave\"]))",
"execution_count": 44,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 44,
"data": {
"text/plain": "OrderedDict{String, Int64} with 5 entries:\n \"Carol\" => 1\n \"Alice\" => 2\n \"Ellen\" => 3\n \"Bob\" => 4\n \"Dave\" => 5"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.111Z",
"end_time": "2022-12-18T12:26:34.568000+09:00"
},
"trusted": true
},
"id": "86607aaf",
"cell_type": "code",
"source": "# SortedDict: キーの昇順\nsdic = SortedDict(name=>value for (value, name) in enumerate([\"Carol\", \"Alice\", \"Ellen\", \"Bob\", \"Dave\"]))",
"execution_count": 45,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 45,
"data": {
"text/plain": "SortedDict{Any, Any, Base.Order.ForwardOrdering} with 5 entries:\n \"Alice\" => 2\n \"Bob\" => 4\n \"Carol\" => 1\n \"Dave\" => 5\n \"Ellen\" => 3"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.116Z",
"end_time": "2022-12-18T12:26:35.269000+09:00"
},
"trusted": true
},
"id": "ea339372",
"cell_type": "code",
"source": "# 標準の Set: 順不同\nset = Set([\"Carol\", \"Alice\", \"Ellen\", \"Bob\", \"Dave\"])",
"execution_count": 46,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 46,
"data": {
"text/plain": "Set{String} with 5 elements:\n \"Carol\"\n \"Alice\"\n \"Dave\"\n \"Ellen\"\n \"Bob\""
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.126Z",
"end_time": "2022-12-18T12:26:35.825000+09:00"
},
"trusted": true
},
"id": "7d7a2cac",
"cell_type": "code",
"source": "# OrderedSet: 追加順\noset = OrderedSet([\"Carol\", \"Alice\", \"Ellen\", \"Bob\", \"Dave\"])",
"execution_count": 47,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 47,
"data": {
"text/plain": "OrderedSet{String} with 5 elements:\n \"Carol\"\n \"Alice\"\n \"Ellen\"\n \"Bob\"\n \"Dave\""
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.130Z",
"end_time": "2022-12-18T12:26:36.487000+09:00"
},
"trusted": true
},
"id": "f6ab2585",
"cell_type": "code",
"source": "# SortedSet: 値の昇順\nsset = SortedSet([\"Carol\", \"Alice\", \"Ellen\", \"Bob\", \"Dave\"])",
"execution_count": 48,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 48,
"data": {
"text/plain": "SortedSet{String, Base.Order.ForwardOrdering} with 5 elements:\n \"Alice\"\n \"Bob\"\n \"Carol\"\n \"Dave\"\n \"Ellen\""
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "2d4c45c7",
"cell_type": "markdown",
"source": "### 解決2:別に `Vector` でキーの列を保持する"
},
{
"metadata": {},
"id": "55ad525f",
"cell_type": "markdown",
"source": "※ その分メモリを消費するが、列挙時にだけ追加順/ソート順が欲しいならそれで十分"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.137Z",
"end_time": "2022-12-18T12:26:36.530000+09:00"
},
"trusted": true
},
"id": "f16d0d59",
"cell_type": "code",
"source": "dkeys = [\"Carol\", \"Alice\", \"Ellen\", \"Bob\", \"Dave\"]\ndic = Dict(key=>value for (value, key) in enumerate(dkeys))",
"execution_count": 49,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 49,
"data": {
"text/plain": "Dict{String, Int64} with 5 entries:\n \"Carol\" => 1\n \"Alice\" => 2\n \"Dave\" => 5\n \"Ellen\" => 3\n \"Bob\" => 4"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.141Z",
"end_time": "2022-12-18T12:26:36.537000+09:00"
},
"trusted": true
},
"id": "d8e675d6",
"cell_type": "code",
"source": "# 追加順に列挙\nfor key in dkeys\n println(\"$(key) => $(dic[key])\")\nend",
"execution_count": 50,
"outputs": [
{
"output_type": "stream",
"text": "Carol => 1\nAlice => 2\nEllen => 3\nBob => 4\nDave => 5\n",
"name": "stdout"
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.145Z",
"end_time": "2022-12-18T12:26:36.545000+09:00"
},
"trusted": true
},
"id": "3851bd4a",
"cell_type": "code",
"source": "# キーの昇順に列挙\nfor key in sort(dkeys)\n println(\"$(key) => $(dic[key])\")\nend",
"execution_count": 51,
"outputs": [
{
"output_type": "stream",
"text": "Alice => 2\nBob => 4\nCarol => 1\nDave => 5\nEllen => 3\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"id": "d6530ff7",
"cell_type": "markdown",
"source": "※ ↑の「解決2」でもそんなに困らない(長さが膨大でなければ)"
},
{
"metadata": {},
"id": "ff65632d",
"cell_type": "markdown",
"source": "### 参考:`DataStructures` パッケージに含まれるその他のデータ構造"
},
{
"metadata": {},
"id": "c170f9f5",
"cell_type": "markdown",
"source": "+ `Stack`, `Queue`, `Deque`, `LinkedList` などの基本的なデータ構造\n+ `RBTree`, `AVLTree`, `SplayTree` 等の木構造\n+ `RobinDict`, `SwissDict` などの内部アルゴリズム(ハッシュアルゴリズムなど)に特定のものを採用した辞書\n+ その他色々(詳細割愛)"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.151Z",
"end_time": "2022-12-18T12:26:36.755000+09:00"
},
"trusted": true
},
"id": "7f0595e4",
"cell_type": "code",
"source": "# LinkedList の例\nlst = list(1, 2, 3) # `lst = cons(1, cons(2, cons(3, nil())))` と書いても同じ",
"execution_count": 52,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 52,
"data": {
"text/plain": "list(1, 2, 3)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "41adb7e3",
"cell_type": "markdown",
"source": "## 無名関数の多重定義"
},
{
"metadata": {},
"id": "e5a77702",
"cell_type": "markdown",
"source": "疑問:\n\n+ `fn = x -> x*x` みたいにして関数を定義すると、`fn(x, y)` のような呼び出しは失敗しますよね。多重定義されていないから。\n+ `fn(x) = x*x` という定義にすれば良い、そうすれば後で `fn(x, y) = ~` と多重定義できる、それは分かります。\n+ では、前者の方法で定義した関数(=無名関数)は多重定義できないの?"
},
{
"metadata": {},
"id": "c8a9fd6f",
"cell_type": "markdown",
"source": "### 解決:できます!"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.157Z",
"end_time": "2022-12-18T12:26:36.835000+09:00"
},
"trusted": true
},
"id": "1685c35b",
"cell_type": "code",
"source": "fn = x -> x * x",
"execution_count": 53,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 53,
"data": {
"text/plain": "#32 (generic function with 1 method)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.160Z",
"end_time": "2022-12-18T12:26:36.835000+09:00"
},
"trusted": true
},
"id": "9be319d6",
"cell_type": "code",
"source": "# 関数 `fn`(正確には変数 `fn` に格納されている無名関数)の多重定義\n(::typeof(fn))(x, y) = x * y",
"execution_count": 54,
"outputs": []
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.164Z",
"end_time": "2022-12-18T12:26:37.405000+09:00"
},
"trusted": true
},
"id": "2b54c18a",
"cell_type": "code",
"source": "methods(fn)",
"execution_count": 55,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 55,
"data": {
"text/plain": "# 2 methods for anonymous function \"#32\":\n[1] (::var\"#32#33\")(x) in Main at In[53]:1\n[2] (::var\"#32#33\")(x, y) in Main at In[54]:2",
"text/html": "# 2 methods for anonymous function <b>#32</b>:<ul><li> (::<b>var\"#32#33\"</b>)(x) in Main at In[53]:1</li> <li> (::<b>var\"#32#33\"</b>)(x, y) in Main at In[54]:2</li> </ul>"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.169Z",
"end_time": "2022-12-18T12:26:37.416000+09:00"
},
"trusted": true
},
"id": "dd230a02",
"cell_type": "code",
"source": "@show fn(5) fn(5, 8)",
"execution_count": 56,
"outputs": [
{
"output_type": "stream",
"text": "fn(5) = 25\nfn(5, 8) = 40\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 56,
"data": {
"text/plain": "40"
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "a3fd3fa5",
"cell_type": "markdown",
"source": "※ 重要:できるけれどほぼ使い途はありません!"
},
{
"metadata": {},
"id": "73c64d56",
"cell_type": "markdown",
"source": "## 文字列のインデックス"
},
{
"metadata": {},
"id": "5201f410",
"cell_type": "markdown",
"source": "疑問:\n\n+ Juliaの(配列などの)インデックスが 1-origin なのはもう慣れるしかないのでOKです。\n+ 文字列(非ASCII文字を含む)のインデックスが連続じゃないのがどうしても慣れないです…。\n+ 例えば「`s = \"123ABCあいう😁漢字\"` に対して「`s`の10文字目」と指定して「`'😁'`」を取得したい…。"
},
{
"metadata": {},
"id": "b6f3c39e",
"cell_type": "markdown",
"source": "### 解決1:`nextind()` という関数が用意されています!"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.175Z",
"end_time": "2022-12-18T12:26:38.396000+09:00"
},
"trusted": true
},
"id": "5585c560",
"cell_type": "code",
"source": "?nextind",
"execution_count": 57,
"outputs": [
{
"output_type": "stream",
"text": "search: \u001b[0m\u001b[1mn\u001b[22m\u001b[0m\u001b[1me\u001b[22m\u001b[0m\u001b[1mx\u001b[22m\u001b[0m\u001b[1mt\u001b[22m\u001b[0m\u001b[1mi\u001b[22m\u001b[0m\u001b[1mn\u001b[22m\u001b[0m\u001b[1md\u001b[22m I\u001b[0m\u001b[1mn\u001b[22md\u001b[0m\u001b[1me\u001b[22m\u001b[0m\u001b[1mx\u001b[22mCar\u001b[0m\u001b[1mt\u001b[22mes\u001b[0m\u001b[1mi\u001b[22ma\u001b[0m\u001b[1mn\u001b[22m Missi\u001b[0m\u001b[1mn\u001b[22mg\u001b[0m\u001b[1mE\u001b[22m\u001b[0m\u001b[1mx\u001b[22mcep\u001b[0m\u001b[1mt\u001b[22m\u001b[0m\u001b[1mi\u001b[22mo\u001b[0m\u001b[1mn\u001b[22m curre\u001b[0m\u001b[1mn\u001b[22mt_\u001b[0m\u001b[1me\u001b[22m\u001b[0m\u001b[1mx\u001b[22mcep\u001b[0m\u001b[1mt\u001b[22m\u001b[0m\u001b[1mi\u001b[22mo\u001b[0m\u001b[1mn\u001b[22ms\n\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 57,
"data": {
"text/plain": "\u001b[36m nextind(str::AbstractString, i::Integer, n::Integer=1) -> Int\u001b[39m\n\n • Case \u001b[36mn == 1\u001b[39m\n If \u001b[36mi\u001b[39m is in bounds in \u001b[36ms\u001b[39m return the index of the start of the\n character whose encoding starts after index \u001b[36mi\u001b[39m. In other words, if\n \u001b[36mi\u001b[39m is the start of a character, return the start of the next\n character; if \u001b[36mi\u001b[39m is not the start of a character, move forward\n until the start of a character and return that index. If \u001b[36mi\u001b[39m is\n equal to \u001b[36m0\u001b[39m return \u001b[36m1\u001b[39m. If \u001b[36mi\u001b[39m is in bounds but greater or equal to\n \u001b[36mlastindex(str)\u001b[39m return \u001b[36mncodeunits(str)+1\u001b[39m. Otherwise throw\n \u001b[36mBoundsError\u001b[39m.\n\n • Case \u001b[36mn > 1\u001b[39m\n Behaves like applying \u001b[36mn\u001b[39m times \u001b[36mnextind\u001b[39m for \u001b[36mn==1\u001b[39m. The only\n difference is that if \u001b[36mn\u001b[39m is so large that applying \u001b[36mnextind\u001b[39m would\n reach \u001b[36mncodeunits(str)+1\u001b[39m then each remaining iteration increases\n the returned value by \u001b[36m1\u001b[39m. 
This means that in this case \u001b[36mnextind\u001b[39m can\n return a value greater than \u001b[36mncodeunits(str)+1\u001b[39m.\n\n • Case \u001b[36mn == 0\u001b[39m\n Return \u001b[36mi\u001b[39m only if \u001b[36mi\u001b[39m is a valid index in \u001b[36ms\u001b[39m or is equal to \u001b[36m0\u001b[39m.\n Otherwise \u001b[36mStringIndexError\u001b[39m or \u001b[36mBoundsError\u001b[39m is thrown.\n\n\u001b[1m Examples\u001b[22m\n\u001b[1m ≡≡≡≡≡≡≡≡≡≡\u001b[22m\n\n\u001b[36m julia> nextind(\"α\", 0)\u001b[39m\n\u001b[36m 1\u001b[39m\n\u001b[36m \u001b[39m\n\u001b[36m julia> nextind(\"α\", 1)\u001b[39m\n\u001b[36m 3\u001b[39m\n\u001b[36m \u001b[39m\n\u001b[36m julia> nextind(\"α\", 3)\u001b[39m\n\u001b[36m ERROR: BoundsError: attempt to access 2-codeunit String at index [3]\u001b[39m\n\u001b[36m [...]\u001b[39m\n\u001b[36m \u001b[39m\n\u001b[36m julia> nextind(\"α\", 0, 2)\u001b[39m\n\u001b[36m 3\u001b[39m\n\u001b[36m \u001b[39m\n\u001b[36m julia> nextind(\"α\", 1, 2)\u001b[39m\n\u001b[36m 4\u001b[39m",
"text/markdown": "```\nnextind(str::AbstractString, i::Integer, n::Integer=1) -> Int\n```\n\n * Case `n == 1`\n\n If `i` is in bounds in `s` return the index of the start of the character whose encoding starts after index `i`. In other words, if `i` is the start of a character, return the start of the next character; if `i` is not the start of a character, move forward until the start of a character and return that index. If `i` is equal to `0` return `1`. If `i` is in bounds but greater or equal to `lastindex(str)` return `ncodeunits(str)+1`. Otherwise throw `BoundsError`.\n * Case `n > 1`\n\n Behaves like applying `n` times `nextind` for `n==1`. The only difference is that if `n` is so large that applying `nextind` would reach `ncodeunits(str)+1` then each remaining iteration increases the returned value by `1`. This means that in this case `nextind` can return a value greater than `ncodeunits(str)+1`.\n * Case `n == 0`\n\n Return `i` only if `i` is a valid index in `s` or is equal to `0`. Otherwise `StringIndexError` or `BoundsError` is thrown.\n\n# Examples\n\n```jldoctest\njulia> nextind(\"α\", 0)\n1\n\njulia> nextind(\"α\", 1)\n3\n\njulia> nextind(\"α\", 3)\nERROR: BoundsError: attempt to access 2-codeunit String at index [3]\n[...]\n\njulia> nextind(\"α\", 0, 2)\n3\n\njulia> nextind(\"α\", 1, 2)\n4\n```\n",
"text/latex": "\\begin{verbatim}\nnextind(str::AbstractString, i::Integer, n::Integer=1) -> Int\n\\end{verbatim}\n\\begin{itemize}\n\\item Case \\texttt{n == 1}\n\nIf \\texttt{i} is in bounds in \\texttt{s} return the index of the start of the character whose encoding starts after index \\texttt{i}. In other words, if \\texttt{i} is the start of a character, return the start of the next character; if \\texttt{i} is not the start of a character, move forward until the start of a character and return that index. If \\texttt{i} is equal to \\texttt{0} return \\texttt{1}. If \\texttt{i} is in bounds but greater or equal to \\texttt{lastindex(str)} return \\texttt{ncodeunits(str)+1}. Otherwise throw \\texttt{BoundsError}.\n\n\n\\item Case \\texttt{n > 1}\n\nBehaves like applying \\texttt{n} times \\texttt{nextind} for \\texttt{n==1}. The only difference is that if \\texttt{n} is so large that applying \\texttt{nextind} would reach \\texttt{ncodeunits(str)+1} then each remaining iteration increases the returned value by \\texttt{1}. This means that in this case \\texttt{nextind} can return a value greater than \\texttt{ncodeunits(str)+1}.\n\n\n\\item Case \\texttt{n == 0}\n\nReturn \\texttt{i} only if \\texttt{i} is a valid index in \\texttt{s} or is equal to \\texttt{0}. Otherwise \\texttt{StringIndexError} or \\texttt{BoundsError} is thrown.\n\n\\end{itemize}\n\\section{Examples}\n\\begin{verbatim}\njulia> nextind(\"α\", 0)\n1\n\njulia> nextind(\"α\", 1)\n3\n\njulia> nextind(\"α\", 3)\nERROR: BoundsError: attempt to access 2-codeunit String at index [3]\n[...]\n\njulia> nextind(\"α\", 0, 2)\n3\n\njulia> nextind(\"α\", 1, 2)\n4\n\\end{verbatim}\n"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.188Z",
"end_time": "2022-12-18T12:26:38.397000+09:00"
},
"trusted": true
},
"id": "ba56f312",
"cell_type": "code",
"source": "s = \"123ABCあいう😁漢字\"",
"execution_count": 58,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 58,
"data": {
"text/plain": "\"123ABCあいう😁漢字\""
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.191Z",
"end_time": "2022-12-18T12:26:38.397000+09:00"
},
"trusted": true
},
"id": "4e164ef0",
"cell_type": "code",
"source": "nextind(s, 0, 10) # インデックス番号 `0` より後で `10` 番目の文字のインデックス",
"execution_count": 59,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 59,
"data": {
"text/plain": "16"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.193Z",
"end_time": "2022-12-18T12:26:38.675000+09:00"
},
"trusted": true
},
"id": "eda5afae",
"cell_type": "code",
"source": "s[nextind(s, 0, 10)] # `s` の10文字目",
"execution_count": 60,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 60,
"data": {
"text/plain": "'😁': Unicode U+1F601 (category So: Symbol, other)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "599b36c9",
"cell_type": "markdown",
"source": "### 参考1:他の `xxxxind()` 系の関数"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.196Z",
"end_time": "2022-12-18T12:26:38.685000+09:00"
},
"trusted": true
},
"id": "6324a002",
"cell_type": "code",
"source": "# `thisind()`: 指定したインデックス番号あたりに存在する文字の正しいインデックス番号を返す\ns = \"123ABCあいう😁漢字\"\n@show thisind(s, 9)\n@show s[thisind(s, 9)]",
"execution_count": 61,
"outputs": [
{
"output_type": "stream",
"text": "thisind(s, 9) = 7\ns[thisind(s, 9)] = 'あ'\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 61,
"data": {
"text/plain": "'あ': Unicode U+3042 (category Lo: Letter, other)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.198Z",
"end_time": "2022-12-18T12:26:38.689000+09:00"
},
"trusted": true
},
"id": "36f18b01",
"cell_type": "code",
"source": "# `prevind()`: 指定したインデックス番号より前に存在する文字の正しいインデックス番号を返す\ns = \"123ABCあいう😁漢字\"\n@show prevind(s, 16)\n@show s[prevind(s, 16)]",
"execution_count": 62,
"outputs": [
{
"output_type": "stream",
"text": "prevind(s, 16) = 13\ns[prevind(s, 16)] = 'う'\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 62,
"data": {
"text/plain": "'う': Unicode U+3046 (category Lo: Letter, other)"
},
"metadata": {}
}
]
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.201Z",
"end_time": "2022-12-18T12:26:38.691000+09:00"
},
"trusted": true
},
"id": "b0cc246f",
"cell_type": "code",
"source": "# `prevind()` を利用すると「後ろから何文字目」という指定が可能\ns = \"123ABCあいう😁漢字\"\n@show prevind(s, lastindex(s)+1, 3) # 後ろから3文字目の文字のインデックス\n@show s[prevind(s, end+1, 3)] # 後ろから3文字目の文字",
"execution_count": 63,
"outputs": [
{
"output_type": "stream",
"text": "prevind(s, lastindex(s) + 1, 3) = 16\ns[prevind(s, end + 1, 3)] = '😁'\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"execution_count": 63,
"data": {
"text/plain": "'😁': Unicode U+1F601 (category So: Symbol, other)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "9a597c05",
"cell_type": "markdown",
"source": "### 参考2:`eachindex()` 関数"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.205Z",
"end_time": "2022-12-18T12:26:38.722000+09:00"
},
"trusted": true
},
"id": "c0ad4715",
"cell_type": "code",
"source": "# 文字列の全てのインデックスを列挙するなら↓\ns = \"123ABCあいう😁漢字\"\n@show eachindex(s);\n@show collect(eachindex(s));",
"execution_count": 64,
"outputs": [
{
"output_type": "stream",
"text": "eachindex(s) = Base.EachStringIndex{String}(\"123ABCあいう😁漢字\")\ncollect(eachindex(s)) = [1, 2, 3, 4, 5, 6, 7, 10, 13, 16, 20, 23]\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"id": "0ab37921",
"cell_type": "markdown",
"source": "### 参考3:`split()` しても似たようなことはできますが…"
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2022-12-18T03:25:37.208Z",
"end_time": "2022-12-18T12:26:38.988000+09:00"
},
"trusted": true
},
"id": "bd1e34e0",
"cell_type": "code",
"source": "s = \"123ABCあいう😁漢字\"\nsplit(s, \"\")[10]",
"execution_count": 65,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 65,
"data": {
"text/plain": "\"😁\""
},
"metadata": {}
}
]
},
{
"metadata": {},
"id": "e86e985d",
"cell_type": "markdown",
"source": "※この場合以下に注意:\n\n+ 結果は文字(`Char`)ではなく文字列(`SubString{String}`)になる\n+ 一時的に文字列(部分文字列)の配列が生成されるのでパフォーマンスは決して良くない(使い方次第)"
},
{
"metadata": {
"trusted": true
},
"id": "5bb23cf8",
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"name": "julia-1.8",
"display_name": "Julia 1.8.3",
"language": "julia"
},
"language_info": {
"file_extension": ".jl",
"name": "julia",
"mimetype": "application/julia",
"version": "1.8.3"
},
"gist": {
"id": "0636c79fbdbcb322b6f4dac3f32c0616",
"data": {
"description": "細かすぎて伝わらないかもしれないJuliaのTips.jl.ipynb",
"public": true
}
},
"_draft": {
"nbviewer_url": "https://gist.github.com/0636c79fbdbcb322b6f4dac3f32c0616"
}
},
"nbformat": 4,
"nbformat_minor": 5
}