The details regarding the API input, output and data processing is available in section - Input parameters for API, Output of API and Data Processing by the API.
- Generate pdf for form responses (Priority) - A user submits a response to a custom form (may not be a Google form) and will get a pdf copy of his response.
- Generate meeting/other invitations - A custom invitation for all the individuals regarding a meeting or similar activity.
- Generate Invoices - Invoices can be generated for individual users/departments.
- Generate digital report cards for students - Report cards can be generated for individuals which can include their performance for a particular session/semester.
- Generate performance sheets for employees - Performance sheets can be generated for individuals based on their performance when evaluated on particular parameters.
- Generate sales reports for individual/departments - Sales report can be generated for individuals/departments.
- Resume generation - User can generate a resume based on a particular template.
- Homogeneous User/Employee profile generation - It can be used to generate user homogeneous profiles that can be displayed on the companies portal/website.
- User/Department Expense report generation - It can be used to generate the expense report for a particular user/department.
- Employee Salary Slip generation - Salary slips can be generated in a particular homogeneous format for all the users/employees.
- Project Milestones tracker generation - It can be used to generate a tracker for project milestones. The template would have statistical attributes. And the project would generate milestone tracker for various departments.
- Generating PDF - The API will take the template + mapping sheet + data as input and return the PDF generation status.
- If there is no other PDF in queue then it'll return PDF url.
- If there are PDF's in queue then it'll add the pdf to queue and will return the current status of request.
- Checking the status of PDF generation - The API will take input the template + mapping sheet + data and will return the PDF generation status.
- If there are PDF's in queue then it'll return the PDF status.
- If the PDF has already been generated then it'll return the PDF url.
-
Submitting same request twice - If the API receives the same request (same combination of data + mapping sheet + template) then it'll not process the request and will return the current status/pdf url.
-
Submitting same data but different template or mapping - In this case the developer calling the API should make sure that the
uuid
sent is different than the previous request else the API will not process data and will return the status/pdf url of the previous request.
Note - The API will only receive individual requests. But in case it receives multiple individual requests simultaneously then it'll add them to queue and will return the current status of request.
- Variable Mapping Spreadsheet id (variable name -
varMappingID
) - Docs template id (variable name -
docTemplateID
) - data in json format (variable name -
dataAll
). This should also containuuid
, this will be used to check the status of the ongoing requests. The datadataAll
should be in a dictionary like format {"data Header": "data Value",}
- How will
uuid
be used? - When the API receives a request it'll first check whether theuuid
exists in our database. If it exists in our database then the pdf url from the database will be returned. Else the request will be processed.
- It'll return the current status of the process or the url of the pdf if it is generated.
- Output If pdf is generated:
{
"status":"completed",
"url":"url_here"
}
- Output if the process is in queue:
{
"status":"queue"
}
The API will receive the required data and will do the following in the respective order:
- Get the
uuid
from data and check if it exists in our database. If it exists then return the status associated with it. - If the
uuid
doesn't exist then check if the data is in proper format. - Then I'll fetch the mapping data from the sheet.
- After that it'll map the variables including mapping the options. While mapping the variables it'll check and correct the urls.
- Then it'll generate the app script url for the processed data.
- Then the data urls will be queued and then sent to app script.
- The app script will then process the data and generate a new file in google drive. While the app script is processing the data it'll sent authentication tokens to fetch the images.
- Then the pdf will be downloaded from the drive.
- After all the pdf's (of all the requests in the queue) are downloaded then they all will be uploaded to GCS and it'll return the url of pdf.
- Then the pdf url is added to the database.
-
It should contain the mapping details in sheet named
mappingDetails
. -
It should have the options sheet named
optionSheet
. -
The mapping sheet (
mappingDetails
) will have variable mapping in the following format:
variableInTemplate | variableType | variableInRawData |
---|---|---|
0 | text | studentFirstName |
1 | text | studentLastName |
2 | imageLink | studentImage |
3 | text | studentRollNo |
4 | imageLink | studentSignature |
5 | options | myVariable1 |
6 | options | myVariable2 |
7 | options | myVariable3 |
- The options sheet (
optionSheet
) will have the mapping details in the following format:
myVariable1 | a::first option | b::this might | c::value | d::value | |||
---|---|---|---|---|---|---|---|
myVariable2 | a::this is an option | b::that value | c::value | d::value | e::value | f::values here | g::value available |
myVariable3 | a::that option | b::this is variable | c::value | d::value | |||
myVariable4 | a::this is answer | b::it is not true | c::value | d::value | e::values here | f::not available | g::orange |
- The template document should have the placeholders in the following format:
Student Detail Document.
The name of student is <<0>> <<1>>
The roll number of student is <<3>>
The picture of student is <<2>>
The signature of student is <<4>>
.
.
The students response to Question 1 is <<5>>
- The API will not work if any of the input parameters are entered incorrectly.
- E.g. if the template id is entered wrong.
- The correction of input parameters will be checked by trying to get data from the urls. and an exception will be returned if some error is found
error in input parameters
.
- The API will only accept post requests.
- It will not work if the authenticated account did not have the readonly permission for the Spreadsheet and the read write permission to the google drive of the authenticated account (The template document will be present in this drive).
- User authentication is not being done at this point. So the user has to make and account (authenticated account) the owner of the template id.
- The API will not work if the mapping is incorrect, means a variable mapped in the mapping sheet to a value which is not present in the variable dictionary
dataAll
. It will return an errorwrong mapping
. - It will not work if the app script is installed on some other drive than that of authenticated account.
- It will work only if the app script is published as a web app and has the following scopes enabled
https://www.googleapis.com/auth/documents
https://www.googleapis.com/auth/drive
https://www.googleapis.com/auth/script.external_request
- If the url (of image) present in the excel sheet is wrong, then the pdf will be generated with a sample image in its place.
- If the data is not found for a template it'll write
No data found
in the template. - The temporary document in the drive will be deleted once its pdf is uploaded to the gcs.
- The API will queue the requests and will work for single pdf generation.
- The filename on the drive and the gcs will be generated automatically. The user can't supply a custom file name.
Note - API and user authentication is not being implemented at this point of time.
- Why use database?
- We need to store the details for all the pdf's we are generating. So, that we can process queries related to them even after 2 months of generation of pdf. Also, we are using queuing so need a database to search from.
- Database to be used - plsql
- Database schema -
uuid, google doc url (returned by the app script) , pdf name, current status (waiting, processing, complete), GCS URL
- What will the API return?
- It will return the final url of the pdf uploaded to GCS or the current status.
- How will you get the pdf url if it is 300th request in 1 minute?
- The API will implement queuing and in case if it has a lot of requests then it'll return response "
request in queue
"
- The API will implement queuing and in case if it has a lot of requests then it'll return response "
- How will you check if the pdf is generated for a particular request?
- It will have a module that will check the status of your request by providing the
uuid
. If the pdf is generated and uploaded to GCS then it'll return the url of uploaded PDF.
- It will have a module that will check the status of your request by providing the
- When the template placeholder (
<<19>>
) for an image/text is not found, what will it do.- If the placeholder for an image/text is not found then no error or exception will be thrown, a sample image will be placed and
No data found
will be written in case of text. The PDF will still be generated using the provided data and arguments.
- If the placeholder for an image/text is not found then no error or exception will be thrown, a sample image will be placed and
- What will happen if it gets so many requests that at any time there are some requests in queue and it didn't get to the module for uploading the pdfs to GCS
- It can be resolved by uploading all the pdfs after processing 10 elements in the queue.
- How many requests can be queued at a given time?
- 100 requests can be added to the queue. After that it'll return queue overflow.
- How will the queue process the requests?
- It will process requests in fifo order.
- What about API authentication
- The API will implement Token authentication. Although there would be no user authentication at this point.
- Write content of body of one document to other document and then saving it.
- Modify urls to actual ones. Send authentication tokens along with them to fetch the image.
- Fetch the pdf url from instance id.
- Store things in database.
- Implement queue model. and after processing 10 items from the queue upload them to gcs.
-
Getting all the data and checking if it is in proper format (It should include uuid as that will be used for checking the status of the project).
-
Fetching data from google spreadsheets on the basis of the data received on the API.
-
Variable mapping, including mapping the options.
-
Correcting and Checking the urls.
4.1. Modify urls to actual ones.
-
Generating the App Script url for the received data.
-
Queuing the data to be sent to the App script.
-
Appscript processing the data and generating a new file.
7.1. Send authentication tokens along with them to fetch the image. 7.2. Write content of body of one document to other document and then saving it.
-
Downloading the pdf when generated successfully.
-
Uploading the pdf to GCS and getting a return url
-
Adding the contents to the database.
-
Creating a module to check if the pdf is generated for a particular uuid. Returning the status of request or the final url of the uploaded pdf. It will check the status of process in queue and then in the database and will return the appropriate response.
Module | Comments |
---|---|
1 | New module. Doesn't exist already. |
2 | Modifying the existing module. |
3 | 20 mins for mapping the variables. 50 mins (hopefully) for mapping the options. |
4 | 10 mins for modifying the urls. |
5 | It'll be generated on the basis of the collected data. |
6 | I've never implemented anything like this before. So, considering the time on the basis of that fact. |
7.1 | I have to fetch the images after doing authentication. Not sure how to do it. |
7.2 | We tried various things, nothing worked. Have to find a new efficient method to do it. If nothing worked then we can implement the currently implemented method. |
8 | will implement threading here. |
9 | will do it using threading. |
10 | Create all the required modules related to databases. |
11 | Have to create all related modules. |
- We are not authenticating the API at this point.
- No user authentication is available.
- The user have to grant write access to admin@samagragovernance (or any other authentication account) the document template.
- The project can work for various similar use cases.
- User authentication.
- Transferring the content to user drives.
the template placeholders and variable mapping should start from 1 and should be in increasing order without missing and item in between.