Salman* javagrails

@javagrails
javagrails / setup-fusion-cluster.sh
Created August 18, 2021 18:39 — forked from nddipiazza/setup-fusion-cluster.sh
Install a Fusion cluster
# Run this script to install a Fusion cluster locally.
#
# It creates fusion-1, fusion-2, etc. directories in the current working directory.
#
# You can then either run those directories from the same machine or copy them to separate instances.
#
# There are two optional command-line options:
#
# --no-download Do not download Fusion from https://download.lucidworks.com; instead, use the tar.gz file already in this directory.
# -v Verbose mode.
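
The two options described above could be handled with a small `case` loop. This is a minimal sketch, not the script's actual code: the names `DOWNLOAD`, `VERBOSE`, and `parse_args` are hypothetical and only illustrate one common way to parse such flags in bash.

```shell
#!/bin/bash
# Hypothetical option parsing; variable and function names are assumptions,
# not taken from setup-fusion-cluster.sh.
DOWNLOAD=1   # 1 = fetch the tarball, 0 = reuse a tar.gz already in this directory
VERBOSE=0    # 1 = print extra progress output

parse_args() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --no-download) DOWNLOAD=0 ;;
      -v)            VERBOSE=1 ;;
      *) echo "Unknown option: $arg" >&2; return 1 ;;
    esac
  done
}
```

A script using this sketch would call `parse_args "$@" || exit 1` near the top and then branch on `$DOWNLOAD` and `$VERBOSE`.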
@javagrails
javagrails / README.md
Created August 18, 2021 18:43 — forked from kordless/README.md
Indexing XKCD with Lucidworks Fusion and Google Image API

Overview

This Seed Streams guide illustrates how to use Lucidworks Fusion to crawl a specific set of documents on a website whose URIs match a regular expression. Additionally, img src fields are extracted with a JavaScript parsing stage and inserted into the index for use in other indexing stages. A vision network may be utilized to extract additional fields from the images.
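
To make the idea of limiting a crawl to URIs that match a regular expression concrete, here is a minimal bash sketch. The pattern is an assumption (numbered comic pages of the form `https://xkcd.com/<number>/`) and is not taken from this guide; in Fusion itself, the expression would go in the datasource configuration rather than in a script. The names `COMIC_URI_REGEX` and `is_comic_uri` are hypothetical.

```shell
#!/bin/bash
# Assumed pattern: numbered comic pages such as https://xkcd.com/327/.
# This regex is illustrative only and does not come from the guide.
COMIC_URI_REGEX='^https://xkcd\.com/[0-9]+/?$'

# Succeeds (exit status 0) when the URI matches the assumed comic-page pattern.
is_comic_uri() {
  [[ "$1" =~ $COMIC_URI_REGEX ]]
}

# Classify a few sample URIs the way a filtered crawl would.
for uri in "https://xkcd.com/327/" "https://xkcd.com/about/" "https://xkcd.com/1/"; do
  if is_comic_uri "$uri"; then
    echo "crawl: $uri"
  else
    echo "skip:  $uri"
  fi
done
```

Anchoring the pattern with `^` and `$` keeps pages such as the about page out of the match while still accepting a trailing slash.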

Start Fusion and Create a New Application

  1. Start a Fusion instance on Google. Click the link the script outputs to navigate to the Fusion instance page. Set a password. Log in with admin and the new password.
  2. Create a new application. Call it XKCD.
  3. Click on the new application.

Add a New Datasource and Limit the Documents

  1. Create a new datasource under Indexing..Datasources. Add a web source. Add https://xkcd.com a
#!/bin/bash
NEW_UUID=$(cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 4 | head -n 1)
SERVER_NAME="ubuntu-dev-$NEW_UUID"
gcloud compute instances create "$SERVER_NAME" \
--machine-type "n1-standard-1" \
--image "ubuntu-1604-xenial-v20170811" \
--image-project "ubuntu-os-cloud" \
--boot-disk-size "10" \
--boot-disk-type "pd-ssd" \
--boot-disk-device-name "$NEW_UUID" \
@javagrails
javagrails / start-fusion.sh
Created August 18, 2021 18:44 — forked from kordless/start-fusion.sh
Script to start Lucidworks Fusion from Google Cloud Console
#!/bin/bash
NEW_UUID=$(cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 4 | head -n 1)
gcloud compute instances create "fusion-server-$NEW_UUID" \
--machine-type "n1-standard-8" \
--image "ubuntu-1604-xenial-v20170811" \
--image-project "ubuntu-os-cloud" \
--boot-disk-size "50" \
--boot-disk-type "pd-ssd" \
--boot-disk-device-name "$NEW_UUID" \
--zone us-west1-b \