@VyrCossont
Last active January 27, 2023 03:43
filter by users and sort by date in Mastodon. see https://demon.social/@vyr/109296080568061011

new search features

  • setting SEARCH_ALL_VISIBLE_TOOTS=true in your instance config will let users search all non-private posts, not just the ones they've interacted with.
  • prefixing a search query with 🔍 will use ES simple query string syntax instead of Mastodon's weird broken subset of it, which will let you use "double quotes" for literal strings including white space, + to force a term to be included, or - to force a term to be excluded.
  • prefixing a search query with 🔎 will use ES regular query string syntax, which is like simple query string syntax, except:
    • syntax errors will cause the search to fail, so if you don't get anything back, that's why.
    • you can specify individual fields to match on. if you don't specify a field for a given search term, it does a normal full-text search for it.
      • for example, 🔎 macOS acct:vyr, if run by a demon.social user, would find all the times i complained about Apple's desktops.
    • the most useful one is acct, which is a username for users on the current instance and a username@domain for users of any other instance.
    • username is the part before the @, such as vyr
    • domain is the part after the @, such as demon.social. note that this is set for both local and remote users, so you can search for posts from your own instance.
    • created_at is a date field used for sorting posts by date as described below.
  • following 🔍 or 🔎 immediately with 📈 or 📉 will sort the results from oldest to newest or from newest to oldest, respectively.
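the prefix handling described above can be sketched as a small standalone helper. this is only an illustration: parse_search_prefix, MODE_PREFIXES and SORT_PREFIXES are hypothetical names that don't appear in the patch, which does the same checks inline in SearchService. the emoji are written as Unicode escapes so the snippet survives copy-paste through tools that mangle them.

```ruby
# Hypothetical helper, not part of the patch: split a raw search query into
# a query mode, an optional date sort order, and the remaining query text.
MODE_PREFIXES = {
  "\u{1F50D}" => :simple_query_string, # left-pointing magnifying glass
  "\u{1F50E}" => :query_string         # right-pointing magnifying glass
}.freeze

SORT_PREFIXES = {
  "\u{1F4C8}" => 'asc',  # chart increasing: oldest to newest
  "\u{1F4C9}" => 'desc'  # chart decreasing: newest to oldest
}.freeze

def parse_search_prefix(raw_query)
  text = raw_query.to_s.strip
  mode_prefix = MODE_PREFIXES.keys.find { |prefix| text.start_with?(prefix) }
  # No mode prefix: the query is left for Mastodon's default parser.
  return { mode: :default, order: nil, text: text } unless mode_prefix

  text = text.delete_prefix(mode_prefix).strip
  sort_prefix = SORT_PREFIXES.keys.find { |prefix| text.start_with?(prefix) }
  text = text.delete_prefix(sort_prefix).strip if sort_prefix

  # SORT_PREFIXES[nil] is nil, meaning "no explicit date sort".
  { mode: MODE_PREFIXES[mode_prefix], order: SORT_PREFIXES[sort_prefix], text: text }
end
```

with this sketch, a query made of the right-pointing magnifying glass, the decreasing chart, and then macOS acct:vyr comes back as mode :query_string, order 'desc', and text "macOS acct:vyr". a sort emoji without a leading magnifying glass is treated as ordinary query text.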

deploying this patch

make backups

take an instance VM snapshot or whatever. or don't, i'm not your mom.

apply the patch

cd your/mastodon/checkout/directory
git apply path/to/mastodon-advanced-search.patch

enable the new feature

Add this line to your .env.production config file:

SEARCH_ALL_VISIBLE_TOOTS=true
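
judging by the patch, the flag is compared against the exact string true, so values like TRUE or 1 would presumably leave the feature off. a minimal sketch of that check, using a hypothetical helper name (the patch itself stores the result in a constant on SearchService when the class loads, which is why a restart is needed):

```ruby
# Hypothetical helper mirroring the string comparison made in the patch.
# Accepts any hash-like object so it can be exercised without touching ENV.
def search_all_visible_toots?(env = ENV)
  env['SEARCH_ALL_VISIBLE_TOOTS'] == 'true'
end

search_all_visible_toots?('SEARCH_ALL_VISIBLE_TOOTS' => 'true') # enabled
search_all_visible_toots?('SEARCH_ALL_VISIBLE_TOOTS' => 'TRUE') # not enabled: case matters
search_all_visible_toots?({})                                   # not enabled: unset
```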

restart the Mastodon web server

sudo systemctl restart mastodon-web.service

reindex all posts

add the new index fields to ES so you can filter by user and sort by date. you only need to do this once. it's supposed to be zero downtime, although it will burn a lot of disk and CPU as it goes through every post on your server again. consequently, it'll take a while, so run it in tmux or whatever you use to keep long-running one-off jobs from dying when you close the terminal.

# you need this gem to do fast reindexing
bundle add parallel

RAILS_ENV=production PROGRESS=true bin/rake 'chewy:parallel:reset[statuses]'

note that this results in statuses becoming an alias backed by the newly reindexed data in a new index named something like statuses_1615749400942 (the suffix is derived from the current time), as an artifact of Chewy zero-downtime reindexing. this shouldn't change how anything works, but if it annoys you, you can invoke the ES reindex or clone APIs (depending on version) to copy the new index back to statuses. this is entirely optional.
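
if you end up with several suffixed indexes and want to know which one is newest, the suffix can be decoded. this sketch assumes the number is a Unix timestamp in milliseconds, which is what it looks like but isn't confirmed anywhere in this gist; index_suffix_time is a hypothetical helper name.

```ruby
# Assumption: the numeric suffix is a Unix timestamp in milliseconds.
# Returns a UTC Time, or nil if the name has no numeric suffix.
def index_suffix_time(index_name)
  suffix = index_name[/_(\d+)\z/, 1]
  return nil unless suffix

  millis = suffix.to_i
  # Integer arithmetic avoids float rounding on the millisecond part.
  Time.at(millis / 1000, millis % 1000, :millisecond).utc
end

puts index_suffix_time('statuses_1615749400942').inspect
```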

diff --git a/app/chewy/statuses_index.rb b/app/chewy/statuses_index.rb
index 6dd4fb18b..dc2113e1d 100644
--- a/app/chewy/statuses_index.rb
+++ b/app/chewy/statuses_index.rb
@@ -65,6 +65,10 @@ class StatusesIndex < Chewy::Index
root date_detection: false do
field :id, type: 'long'
field :account_id, type: 'long'
+ field :acct, type: 'keyword', value: ->(status) { status.account.acct }
+ field :username, type: 'keyword', value: ->(status) { status.account.username }
+ field :domain, type: 'keyword', value: ->(status) { status.account.domain or Rails.configuration.x.local_domain }
+ field :created_at, type: 'date'
field :text, type: 'text', value: ->(status) { status.searchable_text } do
field :stemmed, type: 'text', analyzer: 'content'
diff --git a/app/services/search_service.rb b/app/services/search_service.rb
index 1a76cbb38..17a6d51c5 100644
--- a/app/services/search_service.rb
+++ b/app/services/search_service.rb
@@ -1,6 +1,9 @@
# frozen_string_literal: true
class SearchService < BaseService
+
+ SEARCH_ALL_VISIBLE_TOOTS = ENV['SEARCH_ALL_VISIBLE_TOOTS'] == 'true'
+
def call(query, account, limit, options = {})
@query = query&.strip
@account = account
@@ -35,7 +38,59 @@ class SearchService < BaseService
end
def perform_statuses_search!
- definition = parsed_query.apply(StatusesIndex.filter(term: { searchable_by: @account.id }))
+ statuses_index = StatusesIndex
+ statuses_index = statuses_index.filter(term: { searchable_by: @account.id }) unless SEARCH_ALL_VISIBLE_TOOTS
+ if @query.start_with?('🔍')
+ # simple query string: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/query-dsl-simple-query-string-query.html
+ query_sort_text = @query.delete_prefix('🔍').strip
+ if query_sort_text.start_with?('📈')
+ query_text = query_sort_text.delete_prefix('📈').strip
+ order_by_date = 'asc'
+ elsif query_sort_text.start_with?('📉')
+ query_text = query_sort_text.delete_prefix('📉').strip
+ order_by_date = 'desc'
+ else
+ query_text = query_sort_text
+ order_by_date = nil
+ end
+
+ definition = statuses_index.query {
+ simple_query_string {
+ query query_text
+ fields ['text']
+ default_operator 'AND'
+ }
+ }
+ if order_by_date
+ definition = definition.order(created_at: order_by_date)
+ end
+ elsif @query.start_with?('🔎')
+ # query string: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/query-dsl-query-string-query.html
+ query_sort_text = @query.delete_prefix('🔎').strip
+ if query_sort_text.start_with?('📈')
+ query_text = query_sort_text.delete_prefix('📈').strip
+ order_by_date = 'asc'
+ elsif query_sort_text.start_with?('📉')
+ query_text = query_sort_text.delete_prefix('📉').strip
+ order_by_date = 'desc'
+ else
+ query_text = query_sort_text
+ order_by_date = nil
+ end
+
+ definition = statuses_index.query {
+ query_string {
+ query query_text
+ default_field 'text'
+ default_operator 'AND'
+ }
+ }
+ if order_by_date
+ definition = definition.order(created_at: order_by_date)
+ end
+ else
+ definition = parsed_query.apply(statuses_index).order(created_at: :desc)
+ end
if @options[:account_id].present?
definition = definition.filter(term: { account_id: @options[:account_id] })
@VyrCossont

Almost entirely superseded by VyrCossont/mastodon#2
