Skip to content

Instantly share code, notes, and snippets.

@robertmryan
Created October 9, 2024 15:06
Show Gist options
  • Save robertmryan/0cb300bdbcd560cfa999b817e8f71999 to your computer and use it in GitHub Desktop.
Save robertmryan/0cb300bdbcd560cfa999b817e8f71999 to your computer and use it in GitHub Desktop.
func upload(files: [URL]) async throws -> Bool {
try await withThrowingTaskGroup(of: Bool.self) { group in
let session = try SSH(host: "xxx", port:"xxx")
let sftp = try session.openSftp()
for file in files {
group.addTask {
try await self.upload(file: file, with: sftp, session: session)
}
}
var count = 0
for try await isSuccessful in group where isSuccessful {
count += 1
}
return count == files.count
}
}
@robertmryan
Copy link
Author

Above and beyond the “make sure the objects session and sftp are Sendable, a few other observations:

  1. I personally would generally move session and sftp into the task group. No reason to make them task-isolated. And as a general principle, I try to keep variables to the narrowest possible scope possible. It makes it easier to reason about code at a glance and reduces the possibility of unintended sharing.

  2. The use of [weak self] (with associated guard let self else {…}) is unnecessary noise. There is no strong reference cycle here. And as soon as this starts, it establishes a strong reference to self, anyway. Maybe if we knew your intent with [weak self] syntax we could comment further, but this pattern is less useful. See async/await, Task and [weak self].

  3. The original code snippet built an array of booleans that were not in the order of files and then count the successes. I would just count the successes directly, if that was your intent, and lose this [Bool].

  4. I didn’t address it above, but there is a potential little dissonance in the design of upload(file:with:session:). Does it return a Boolean on failure or throw errors? Usually you would pick one or the other (usually just throw errors on failure and not return a Bool, so the caller can potentially know why it failed). The above upload(files:) will return false if any of the upload(file:with:session:) calls returned false, but will continue attempting all the other uploads, whereas if any of the upload(file:with:session:) calls threw an error, this will not finish the other uploads and will throw an error, itself. That is curious.

@robertmryan
Copy link
Author

robertmryan commented Oct 9, 2024

Re point 4, above, if you lose the Bool return values and just throw on errors, the code is simplified:

func upload(files: [URL]) async throws {
    try await withThrowingTaskGroup(of: Void.self) { group in
        let session = try SSH(host: "xxx", port:"xxx")
        let sftp = try session.openSftp()

        for file in files {
            group.addTask {
                try await self.upload(file: file, with: sftp, session: session)
            }
        }

        try await group.waitForAll() // whoops, I neglected to include this
    }
}

private func upload(file: URL, with sftp: SFTP, session: SSH) async throws {
    
}

But this begs the question: What do you want to happen if an upload fails? Continue trying the others or stop trying? And if one or more failed, do you want to know which ones failed?

@williamhqs
Copy link

williamhqs commented Oct 10, 2024

Thanks for the comment!

Let met attached my code here a bit more just for better explanation :)

import Foundation
@preconcurrency import Shout

actor SftpUploader {
    
    typealias SFTPKeyURL = (privateKey: URL, publicKey: URL)
    
    func upload(files: [URL]) async throws -> Bool {
        guard isGenerateKeysFileSuccess else {
            return false
        }
        let results = try await withThrowingTaskGroup(of: Bool.self, returning: [Bool].self) { [weak self] group in
            guard let self = self else { return [] }
            let session = try SSH(host: "xxx", port: xxx)
            let sshKey = SSHKey(privateKey: self.keyURLs.privateKey.path,
                                publicKey: self.keyURLs.publicKey.path,
                                passphrase: "")
            try session.authenticate(username: "xxx", authMethod: sshKey)
            let sftp = try session.openSftp()
            for file in files {
                group.addTask {
                    return try await self.upload(file: file, with: sftp, session: session)
                }
            }
            return try await group.reduce(into: [Bool]()) { partialResult, isSuccess in
                partialResult.append(isSuccess)
            }
        }
        return results.count == files.count
    }
}

Point 1: Agreed. I moved into task group. ✔️
Point 2: What I'm considering is when the SftupUploader is released/deallocated for some reason, The tasks might not be finished that is in progress. In that case not quite sure if the self was used in the closure can called, I'm worrying if the app crash. Please correct me if my worrying isn't necessary or wrong.
Point3: yes, I was thought that too. The thing is I think I use reduce(into:) is better then for...in and count += 1 stuff. But Return [Bool].self isn't as good as without return value. I do agree. Thanks ✔️
Point4: I can't agree with you more. Bool isn't used actually. I am refactor the legacy code and leave the "old idea" there. A bit shame, I should keep just use throw or bool.

Actually I faced a problem, if any upload throw errors, how to cancel it? Any suggestions?

func upload(files: [URL]) async throws {
        guard isGenerateKeysFileSuccess else {
            return
        }
        try await withThrowingTaskGroup(of: Bool.self) { group in
            let session = try SSH(host: "sftp.atplatform.com.cn", port: 44022)
            let sshKey = SSHKey(privateKey: self.keyURLs.privateKey.path,
                                publicKey: self.keyURLs.publicKey.path,
                                passphrase: "")
            try session.authenticate(username: "atpsftp", authMethod: sshKey)
            let sftp = try session.openSftp()
            for file in files {
                group.addTask {
                    do {
                        return try await self.upload(file: file, with: sftp, session: session)
                    } catch {
//                       group.cancelAll() --> ERROR: Escaping closure captures 'inout' parameter 'group'
                        throw error
                    }
                    
                }
            }
        }
        cleanKeyFiles()
    }


This is also the code after refactor by your four review points. thanks again. Appreciate any suggestions 👍

@williamhqs
Copy link

Back to my issues. After moved the parameter inside the task the errors turns into

Value of non-Sendable type '@isolated(any) @async @callee_guaranteed @substituted <τ_0_0> () -> (@out τ_0_0, @error any Error) for <Bool>' accessed after being transferred; later accesses could race; this is an error in the Swift 6 language mode

Not sure about this. :(

@robertmryan
Copy link
Author

robertmryan commented Oct 10, 2024

@williamhqs

Regarding cancelation, consider:

actor Foo {
    func foo() async throws {
        try await withThrowingTaskGroup(of: Void.self) { group in
            for i in 0 ..< 20 {
                group.addTask {
                    try await self.bar(i)
                }
            }
            
            // if you use this pattern (or any pattern that results in `group.next()` to be called),
            // the whole group will be canceled as soon as any task fails.

            for try await _ in group { }
        }
    }
    
    func bar(_ value: Int) async throws {
        print("starting", #function, value)
        
        if value == 4 {
            print("failed", #function, value)
            throw FooError.foobar
        }
        
        try await Task.sleep(for: .seconds(1))
        
        print("successfully finished", #function, value)
    }
    
    enum FooError: Error {
        case foobar
    }
}

That will cancel the whole group as soon as one fails.

However, if you do something like the following, the group will not cancel all of its tasks (though the error will be thrown):

actor Foo {
    func foo() async throws {
        try await withThrowingTaskGroup(of: Void.self) { group in
            for i in 0 ..< 20 {
                group.addTask {
                    try await self.bar(i)
                }
            }
            
            // however if you use this pattern instead, the group is not. as the docs for `waitForAll` says:
            //
            // “If any of the tasks throw, the first error thrown is captured and re-thrown 
            // by this method although the task group is not cancelled when this happens.”

            try await group.waitForAll()
        }
    }
    
    func bar(_ value: Int) async throws {}
}

That is what I was trying to convey when I said:

But this begs the question: What do you want to happen if an upload fails? Continue trying the others or stop trying? And if one or more failed, do you want to know which ones failed?

I didn’t know what you wanted to happen if one failed. Hopefully this helps clarify a few of the options.


Note, this presumes that your upload method even supports cancellation. My trivial bar method does, but I can’t vouch for Shout.

@robertmryan
Copy link
Author

@williamhqs – Now, regarding your remaining error, that would appear to stem from Shout’s failure to define SSH or SFTP as Sendable. I can’t comment directly on that (as this is not a library that I’ve ever used), but given that it’s using a non-isolated class (rather than an isolated one or an actor), so that’s not promising. You might want to post an issue there. I confess that I don’t even know if it supports concurrent requests; glancing at the code, I don’t see any async functions, so that is not promising, either.

But if you post an issue there, the author might be able to comment further. But prepare yourself for the possibility that simply can’t do that. Or you might have to create a separate session for each upload. I can’t comment without doing a lot more investigation (which I am am afraid that I can’t tackle right now as I’m currently quite busy).

@williamhqs
Copy link

@robertmryan That's already a great help for me, thank you for you time, have a good day! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment