Created
March 14, 2024 20:52
Web novel Downloader
function extractAndDownloadAllChapters() {
  // Find all containers that hold chapter text
  const chapterContainers = document.querySelectorAll('.cha-words');
  // Collect the text of each chapter
  const allChaptersText = [];
  chapterContainers.forEach(container => {
    // Get all paragraph elements within the container
    const paragraphs = container.querySelectorAll('p');
    // Extract the text from each paragraph and join them with a newline character
    const chapterText = Array.from(paragraphs).map(p => p.textContent.trim()).join('\n');
    allChaptersText.push(chapterText);
  });
  // Join all chapters with two newline characters to separate them
  const allText = allChaptersText.join('\n\n');
  // Create a Blob with the combined text content
  const blob = new Blob([allText], {
    type: 'text/plain'
  });
  // Create an anchor element and use it to trigger the download
  const anchor = document.createElement('a');
  anchor.href = URL.createObjectURL(blob);
  anchor.download = 'allChaptersText.txt';
  document.body.appendChild(anchor);
  anchor.click();
  document.body.removeChild(anchor);
  // Release the object URL once the download has been triggered
  URL.revokeObjectURL(anchor.href);
}
Hi, I’m facing an issue where the text downloaded using my script is coming out in Chinese for the initial chapters, even though it is supposed to be in English. After scrolling down on the page, I noticed that the text in the later chapters is in English as expected.
It seems like the content for earlier chapters might not be fully loaded when the script runs. Could this be related to lazy loading or dynamic content that only appears as I scroll? If so, how can I ensure all the content is properly loaded before extracting it?
Any guidance would be greatly appreciated. Thank you!
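One possible approach, assuming the site only fills in (or swaps) chapter text once that part of the page has been scrolled into view, which the gist itself does not confirm: walk down the page in small steps, pause after each step, and run the extraction only after reaching the bottom. The helper names `scrollThroughPage` and `browserPage`, the step size, and the delay below are all hypothetical and would need tuning for the actual site.

```javascript
// Hypothetical helper, not part of the gist: scroll down one step at a time,
// pausing after each step so lazily loaded content has time to arrive.
// `page` is an adapter exposing getHeight(), getPosition(), and scrollTo(y).
async function scrollThroughPage(page, { stepPx = 800, delayMs = 500, maxSteps = 1000 } = {}) {
  let steps = 0;
  while (steps < maxSteps) {
    const target = page.getPosition() + stepPx;
    page.scrollTo(target);
    steps++;
    // Give the page time to load or replace content near the new position.
    await new Promise(resolve => setTimeout(resolve, delayMs));
    // The height is re-read after the pause, so a page that grew keeps scrolling.
    if (target >= page.getHeight()) break;
  }
  return steps;
}

// Browser adapter. Assumption: the window itself is the scrolling element.
function browserPage() {
  return {
    // Largest reachable scroll position for the current document height.
    getHeight: () => document.documentElement.scrollHeight - window.innerHeight,
    getPosition: () => window.scrollY,
    scrollTo: y => window.scrollTo(0, y),
  };
}

// In the browser console, after defining extractAndDownloadAllChapters():
//   await scrollThroughPage(browserPage());
//   extractAndDownloadAllChapters();
```

If the early chapters still come out untranslated, a longer `delayMs` or a smaller `stepPx` may help; if the text is replaced by some other mechanism (for example a translation step unrelated to scrolling), this sketch will not fix it.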