[RFC]: develop C implementations for base special mathematical functions #41

Closed
5 tasks done
gunjjoshi opened this issue Mar 17, 2024 · 20 comments

@gunjjoshi
Member

gunjjoshi commented Mar 17, 2024

Full name

Gunj Joshi

University status

Yes

University name

Indian Institute of Information Technology, Kottayam, Kerala

University program

Bachelor of Technology in Computer Science and Engineering

Expected graduation

May 01, 2026

Short biography

I am currently a second-year undergraduate and will be entering my third year by the time the coding period starts. I am pursuing a B.Tech degree in Computer Science and Engineering. My technical competencies include C/C++, JavaScript, Python, AWS and several other web technologies, along with a good amount of knowledge and experience with Docker and Firebase.

My coursework includes a wide variety of subjects covering Computer Organization, Computer Networks, Data Structures, Database Management Systems, Operating Systems, Design and Analysis of Algorithms, and so on.

Some of my general interests include contributing to open-source projects, learning new technologies and developing backends for web applications and other software.

Timezone

Indian Standard Time ( IST ), UTC+05:30

Contact details

email: [email protected], [email protected]

Platform

Mac

Editor

My preferred code editor is Visual Studio Code. In my experience so far, it has been the smoothest and most user-friendly editor I have used. I also value its merge conflict resolver and its large selection of extensions.

Programming experience

Apart from my contributions to stdlib, my programming experience spans personal and group projects, open-source contributions, and competitive programming, using technologies such as Next.js, Node.js, React.js, Tailwind CSS, C/C++, Python, and JavaScript. Some of the projects I've enjoyed working on include the following:

  • Knowledge Sharing Platform: During a Winter of Code program called FOSS Overflow 2023-24, I made significant contributions to this project using Next.js, Firebase, and AWS. My contributions to the development of the Knowledge Sharing Portal (KSP) include:
    • Migrating the codebase from Prisma to Firestore Database.
    • Integrating the File Upload and Download feature using the S3 bucket of AWS.
    • Setting up different privileges for guests, normal users, and admins.
    • Implementing file moderation features for moderating the content of uploaded files on the platform.
  • AutoJoomer: During Hacktoberfest 2023, I contributed to AutoJoomer, a Chrome extension that includes features like automatic login to Wi-Fi and the learning portal. Some of my contributions to the AutoJoomer extension include:
    • Implementing RegExp for validation of user credentials.
    • Fixing the extension interface popup overflow problem.
  • Cybersecurity Club Platform: In 2022, I worked on developing a web platform for the Cybersecurity club of my college using Next.js, Tailwind CSS, and Firebase. Its features include:
    • Registration of new club members.
    • Tracking previously held and upcoming club events.

JavaScript experience

My most recent experiences with JavaScript include my contributions to stdlib, such as adding new packages, adding C implementations for existing mathematical functions, and refactoring some of the blas/ext/base packages to align with current project conventions. Alongside my JavaScript work at stdlib, I've also used JavaScript extensively during my coursework and have completed numerous projects using JavaScript and its frameworks.

One of the best aspects of JavaScript is its rich ecosystem of frameworks. It can be used for creating awesome libraries like stdlib, great frontends using React.js, robust backends using Node.js, and much more!

However, one downside of JavaScript, in my opinion, is that most people do not take it seriously or show much interest in it outside of web development. This perception is gradually changing, especially with projects like stdlib proving that mathematical calculations and visualizations can be done efficiently in JavaScript.

Node.js experience

While working with various web applications and other projects, I have gained extensive experience with Node.js. Not only have I worked with its core functionalities, such as writing server-side code and handling database connections, but I have also tackled tasks like file handling and more.

C/Fortran experience

The first programming language I learned was C during high school, where I created small command-line projects. Building on this foundation, I further developed my skills during my undergraduate coursework and have since made significant contributions to stdlib, which includes adding C implementations for various mathematical functions originally written in JavaScript. Additionally, I have a basic understanding of Fortran.

Interest in stdlib

Ever since I got to know about stdlib, I have been excited to see how things work under the hood. The whole idea of building a mathematical library for JavaScript seemed great. Initially, I thought,

But why do we need this?

As I went deeper into the codebase and talked with the maintainers, I got my answer:

What if we want to use those complex mathematical functions in the browser, without relying on Python libraries?

That was when I developed a huge interest in stdlib. The core idea of having numpy and scipy functionality directly in JavaScript, in our browsers, amazed me. The organization of the entire project impressed me greatly. Even contributions made in remote corners of the project can propagate to other repositories automatically. The approach of employing JavaScript implementations for smaller, lighter functions and C implementations for larger and more complex ones struck me as ingenious.

The mentors are very helpful, and so are the fellow contributors. This, in turn, makes a great community. The whole process of learning, discussing and implementing felt great to someone like me.

Version control

Yes. I have a great amount of experience with GitHub.

Contributions to stdlib

Merged Pull requests : https://github.com/stdlib-js/stdlib/pulls?q=is%3Apr+author%3Agunjjoshi+is%3Amerged+
Open Pull Requests : https://github.com/stdlib-js/stdlib/pulls/gunjjoshi
Issues : https://github.com/stdlib-js/stdlib/issues?q=is%3Aissue+author%3Agunjjoshi+

Goals

After successfully completing this project, C implementations for math/base/special functions will have been added in both single and double precision. New packages, including those which involve complex numbers, will also be added in both precisions. This will represent a significant step towards achieving parity with numpy and scipy. Furthermore, existing implementations whose upstream implementations have received bug fixes will be updated.

In addition to these enhancements, I will work on automation and scaffolding to automate the generation of specialized packages. This will streamline the process of extending the C implementations of math/base/special packages to strided arrays and ndarrays.
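
To make the strided-array goal concrete, here is a minimal sketch of applying a unary double-precision function across a strided input array. The names `strided_map_sketch` and `abs_sketch` are hypothetical and are not stdlib APIs, and the sketch assumes non-negative strides for simplicity.

```c
#include <stddef.h>

/* Hypothetical unary kernel standing in for a math/base/special function. */
double abs_sketch( double x ) {
    return ( x < 0.0 ) ? -x : x;
}

/*
* Applies `fcn` to N elements of X, stepping by `strideX`, and writes each
* result to Y, stepping by `strideY`.
* Assumption: strides are non-negative element counts.
*/
void strided_map_sketch( const size_t N, const double *X, const size_t strideX, double *Y, const size_t strideY, double (*fcn)( double ) ) {
    size_t i;
    for ( i = 0; i < N; i++ ) {
        Y[ i*strideY ] = fcn( X[ i*strideX ] );
    }
}
```

With a scaffold like this, each scalar C implementation only needs to be passed in as `fcn`, which is the kind of repetitive wiring that automation could generate.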

Why this project?

What excites me about this project is that we're building something at a large scale and for a large audience. Having a JavaScript library with numpy and scipy parity is no small thing! While working on certain projects, I personally wished for something like numpy or scipy that would let us use those complex mathematical functions directly in a web application or in the browser, rather than relying on Python libraries.

More specifically, the project to develop C implementations for base special mathematical functions excites me immensely, because these C implementations are essential for optimizing the performance of our existing JavaScript functions. I believe this initiative has the potential to be a game-changer.

Along with that, the idea of working with the very supportive and insightful stdlib community, both mentors and fellow contributors, is what excites me the most.

Qualifications

During the course of this project, I will be utilizing C, JavaScript, and Node.js. As mentioned earlier, I have acquired sufficient knowledge and hands-on experience with these technologies. Throughout my high school and undergraduate coursework, I have consistently delved deeper into the workings of C and JavaScript.

Additionally, I have developed several web applications with Node.js, which have provided me with a solid understanding of the platform.

My contributions to stdlib have further added to my experience with C, JavaScript, and Node.js. As a result of all these factors, I believe I possess the technical know-how required for this project.

Prior art

For this project, some of the work has already been started. C implementations for some of the math/base/special functions have already been completed, as given here. This progress provides us with a foundation to build upon rather than starting entirely from scratch. The ideation of the automation and scaffolding part has also been started here, and we can soon have a final plan and implementation for that too.

Commitment

  • 1 May 2024 - 26 May 2024 : Bonding Period
  • 27 May 2024 - 27 July 2024 : 30 hours per week
  • 28 July 2024 - 26 August 2024 : 20 hours per week

This, in turn, sums up to 340 hours in total.

Apart from this, I aim to get started on the work during the bonding period itself, to ensure that we are always a bit ahead of the intended schedule. I don't have any other commitments for the entire duration of the program, which will enable me to dedicate my full time to this project.

Implementation Plan

  • Initially, for developing C implementations for the existing packages in math/base/special, I will refer to the following package dependency graph to make sure that the package I am working on already has all of its dependencies set up.

    stdlib-package-dependencies.drawio-3.pdf

  • For adding newer packages, I will be drawing on many reference sources rather than relying on just one. It would be better to mix and match among them, keeping in mind factors such as performance, ease of implementation, etc. Some of the sources include:

  • For some of our existing packages, I will be updating their implementations, moving from older reference implementations to newer ones. Again, I will follow the above-mentioned sources for this.

  • I will also be addressing design considerations for some of the existing packages, such as maxn, minn, etc., so that their C implementations become possible within our project conventions.

  • For adding newer packages that involve complex numbers, I will be following C99.

  • Some more packages that we are currently missing will be implemented with regard to IEEE 754.
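
As an illustration of the design considerations mentioned above for maxn and minn, here is a minimal sketch of a non-variadic maximum in C. It assumes the design issue for maxn is its variadic signature, which is an inference from the maintainers' remarks about variadic interfaces rather than something stated in this plan. The name `maxn_sketch` and the pointer-plus-length signature are hypothetical, and returning negative infinity for empty input is an assumption mirroring JavaScript's `Math.max()`.

```c
#include <math.h>
#include <stddef.h>

/*
* Returns the maximum of `len` doubles stored in `x`.
* Hypothetical non-variadic signature: the caller passes a pointer and a length.
*/
double maxn_sketch( const double *x, const size_t len ) {
    double m = -INFINITY; /* assumption: empty input yields -infinity */
    size_t i;
    for ( i = 0; i < len; i++ ) {
        if ( isnan( x[ i ] ) ) {
            return x[ i ]; /* NaN propagates */
        }
        /* The second condition lets +0 replace -0, since they compare equal. */
        if ( x[ i ] > m || ( x[ i ] == m && signbit( m ) ) ) {
            m = x[ i ];
        }
    }
    return m;
}
```

Call sites that currently pass a variable number of arguments would instead build an array and pass its length, which is the kind of call-site update such a refactor implies.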

Schedule

Assuming a 12-week schedule:

  • Community Bonding Period: During this phase, I will be focusing on implementing those functions which have no dependencies that are yet to be implemented in C. For this, I will be following the above-mentioned package dependency graph.

  • Week 1: Focusing on completing the double precision C implementations of packages like rempio2, gamma, gammainv, etc., which block a large number of packages such as sin, cos, tan, etc. This would be an extension of the work that I would have started in the bonding period.

  • Week 2: Continuing work from Week 1, I would be trying to complete the double precision C implementations of the packages aimed in the first week, and would be moving towards other packages dependent on them.

  • Week 3: By the end of this week, I would aim to have completed the double precision C implementations for most of the packages in math/base/special, if not all of them.

  • Week 4: Complete backlogs, if any, and start working on single precision C implementations for all of the packages. For this too, I will be referring to the package dependency graph and the previously mentioned sources.

  • Week 5: During this week, I would be focusing on finishing the C implementations, both single and double precision, for all of the packages.

  • Week 6 (midterm): Once done with all of the packages in math/base/special, I would be adding some newer packages, as per suggestions by the mentors.

  • Week 7: By this week, I would be trying to complete the work from week 6, and then will be moving towards the double precision implementation of the packages involving complex numbers.

  • Week 8: Completing double precision implementations for the additional packages and beginning single precision implementations for them.

  • Week 9: Working on any backlog from previous weeks. Along with that, I would start updating existing implementations which have had bug fixes in their upstream implementations.

  • Week 10: In this week, I would be focusing on completing the updates to existing implementations which have had bug fixes in their upstream implementations. Along with that, I will be modifying pre-existing implementations in math/base/special which need to be updated in accordance with sources such as [FreeBSD](https://svnweb.freebsd.org/base/).

  • Week 11: During this week, I will be looking to finish all of the previous work related to adding and modifying implementations, and would try to start on the automation and scaffolding part, which would require some discussion with the mentors beforehand, as given here. I would aim to reach a conclusion and get clarity on which approach we should follow.

  • Week 12: During this entire week, I will be working on the automation and scaffolding part, and things related to it. If all that goes well, integrating the packages in math/base/special into higher-order ones, such as ndarrays and strided arrays, would become somewhat easier.

  • Final Week: Finishing the last week’s work and, after taking suggestions and feedback from the mentors, working on additional things if required.

  • Post GSoC: I would like to continue contributing to stdlib even after the completion of GSoC, working on more projects, to help stdlib reach the same level as other popular libraries out there.

Moreover, I would be making Pull Requests after the implementation of each package, so that we don't have the burden of reviewing a large chunk of code at the last moment.
I do fully understand that reviewing those PRs from time to time might take a significant amount of time. During that time, I will be working on #1147 in parallel.

In total, I will be aiming for a default 12-week timeline, during which I would be working on this 350 hour project. If we are left with some more time towards the end, we can work in some more depth with strided arrays and ndarrays.

Notes:

  • The community bonding period is a 3 week period built into GSoC to help you get to know the project community and participate in project discussion. This is an opportunity for you to set up your local development environment, learn how the project's source control works, refine your project plan, read any necessary documentation, and otherwise prepare to execute on your project proposal.
  • Usually, even week 1 deliverables include some code.
  • By week 6, you need enough done at this point for your mentor to evaluate your progress and pass you. Usually, you want to be a bit more than halfway done.
  • By week 11, you may want to "code freeze" and focus on completing any tests and/or documentation.
  • During the final week, you'll be submitting your project.

Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes your proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.
@gunjjoshi gunjjoshi added 2024 2024 GSoC proposal. rfc Project proposal. labels Mar 17, 2024
@gunjjoshi
Member Author

Any suggestions on this ?
cc: @kgryte @Planeshifter @Pranavchiku

@Pranavchiku
Member

Hey @gunjjoshi, going over your proposal, these are a few things that I feel can enhance it:

  • Provide link to projects you mentioned in Programming Experience
  • Please mention the version control software you have worked with
  • Your examination commitment is out of GSoC period, no? If yes, you may remove it that is fine.
  • For week 1, you have mentioned small package like round, pow. round is fine and will be wrapped soon, we already have PR for it, pow was highly tricky to bag and that will also be wrapped prior to GSoC.

Incorporate the changes and we can iterate over it again, thank you!

@gunjjoshi
Member Author

Thanks for reviewing my proposal @Pranavchiku. I will incorporate the suggested changes.

@gunjjoshi
Member Author

I have updated this based on your suggestions @Pranavchiku. I have restructured my timeline according to that. You can have a look now !

@kgryte
Member

kgryte commented Mar 31, 2024

Thanks, @gunjjoshi, for sharing your draft proposal. A couple of comments:

  1. For reference implementations, we are not limited to Cephes and FreeBSD. We also draw from Golang, Boost, Julia, FDLIBM, and Slatec. In short, we mix and match based on how we evaluate ease of implementation and general accuracy/performance. For a few of our implementations, we'd likely want to swap out Cephes for something else (e.g., FreeBSD or OpenLibm), as Cephes implementations may be a bit dated at this point.
  2. You may want to spend some time mapping out the dependency graph, so that you know what order double-precision implementations need to be implemented in. E.g., rempio2 is a prerequisite for sin, cos, etc. And as each implementation will be a PR, you'll want to ensure that you plan work out such that you minimize being blocked. E.g., while rempio2 is being reviewed, maybe you're able to continue working on other implementations which are not dependent on rempio2.
  3. Some APIs will have design considerations that need to be addressed. For example, some of the gamma functions are currently variadic, but should be refactored to be non-variadic, as we won't be supporting variadic interfaces in C. This will mean also updating call sites throughout the project accordingly.
  4. For complex number base math functions, I suggest consulting C99: https://en.cppreference.com/w/c/numeric/complex. We'll want to ensure we have full coverage for those APIs.
  5. Similarly, if we are missing functions from IEEE 754, those would be good to add, as well: http://www.dsc.ufcg.edu.br/~cnum/modulos/Modulo2/IEEE754_2008.pdf. One, in particular, that comes to mind is remainder.
  6. If you are blocked, you can always work on stdlib #1147 in parallel.

@gunjjoshi
Member Author

Thanks for the suggestions @kgryte. I will be making a dependency graph for the packages that we currently have in math/base/special, and will incorporate it in the proposal. Along with that, I will modify the proposal as per your remarks.

@gunjjoshi
Member Author

@kgryte, some packages, such as math/base/special/cceilf, have incomplete C implementations. They do have include and src directories, but no native.js. We will be completing their implementations too, right ? Or is this intentional ?

@gunjjoshi
Member Author

gunjjoshi commented Mar 31, 2024

I've added the following package dependency graph.

stdlib-package-dependencies.drawio-3.pdf

This graph contains all packages that currently do not have their C implementations, in math/base/special.

Are there any changes that I can make now ? Also, in the contributions section, according to you, what would be better, listing out all the contributions one by one, or just having links to all of them, as we have now ?

cc: @kgryte

@Pranavchiku
Member

Pranavchiku commented Apr 1, 2024

Just to make things easy for yourself, you may sort the graph in topological order; this will give you a list of the order in which to proceed with implementing the C functions.

Also, you may incorporate math/base/assert/* APIs.

@kgryte
Member

kgryte commented Apr 1, 2024

@gunjjoshi That graph is nice. Thanks for generating. And agreed with @Pranavchiku on performing a topological sort. You can potentially leverage @stdlib/utils/compact-adjacency-matrix for this, which provides a toposort method. We actually also have tooling for this: https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/_tools/pkgs/toposort.
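
For readers unfamiliar with the idea, here is a minimal sketch of a topological sort over a small dependency matrix, using Kahn's algorithm. It is only an illustration of the concept. The function name `toposort_sketch`, the flat matrix layout, and the `MAX_NODES` limit are all assumptions, and in practice the project tooling linked above is what should be used.

```c
#include <stddef.h>

#define MAX_NODES 16

/*
* Kahn's algorithm over a flat n-by-n matrix, where deps[ i*n + j ] != 0
* means package i depends on package j.
* Writes an order in which every package follows its dependencies into
* `order` and returns the number of packages ordered. A return value
* smaller than n indicates a dependency cycle.
*/
size_t toposort_sketch( const size_t n, const unsigned char *deps, size_t *order ) {
    size_t remaining[ MAX_NODES ]; /* unmet dependency count per package */
    unsigned char done[ MAX_NODES ];
    size_t count = 0;
    size_t i;
    size_t j;
    int progressed;

    if ( n > MAX_NODES ) {
        return 0;
    }
    for ( i = 0; i < n; i++ ) {
        remaining[ i ] = 0;
        done[ i ] = 0;
        for ( j = 0; j < n; j++ ) {
            if ( deps[ i*n + j ] ) {
                remaining[ i ] += 1;
            }
        }
    }
    do {
        progressed = 0;
        for ( i = 0; i < n; i++ ) {
            if ( !done[ i ] && remaining[ i ] == 0 ) {
                done[ i ] = 1;
                order[ count++ ] = i;
                /* Every package depending on i has one fewer unmet dependency. */
                for ( j = 0; j < n; j++ ) {
                    if ( deps[ j*n + i ] ) {
                        remaining[ j ] -= 1;
                    }
                }
                progressed = 1;
            }
        }
    } while ( progressed );
    return count;
}
```

As a usage example, if index 0 stands for sin, 1 for rempio2, and 2 for cos, and both sin and cos depend on rempio2, the first entry of `order` is 1, meaning rempio2 must be implemented first.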

@kgryte
Member

kgryte commented Apr 1, 2024

You should be able to use the pattern option to sort only the packages in math/base.

@gunjjoshi
Member Author

Thanks for the topological sort idea @Pranavchiku @kgryte .
I tried to work with https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/_tools/pkgs/toposort, but couldn't figure out how to run it. I tried running it from the command line, but couldn't figure out the exact commands. I thought of using it from npm, but there too, I couldn't find @stdlib/_tools. There is an option for command-line interface usage too, but do we have some example uses or references showing how I can use it?

@kgryte
Member

kgryte commented Apr 1, 2024

From the root stdlib directory,

node ./lib/node_modules/@stdlib/_tools/pkgs/toposort/bin/cli $PWD/lib/node_modules/@stdlib/math/base

@gunjjoshi
Member Author

That actually worked. Thanks @kgryte.
So, based on the output, shall I also list the packages sequentially in the proposal, in the order I would be working on them? That would be a bit lengthier, but would give some clarity. Or is it better to use the output while actually implementing them, and include just the dependency graph in the proposal?

@kgryte
Member

kgryte commented Apr 1, 2024

You can use it to help guide your timeline in terms of which higher priority APIs need to be tackled first before moving on to others. In your final proposal, I think it is enough to simply reference the above command and state that you'll also be using it as a guide for what order you'll need to implement single-precision APIs.

@Pranavchiku
Member

In your free time, you may work on developing a script to add the chore files required in a package. Also, as benchmarks and tests are mostly copied over from existing files, a script will speed you up. Try exploring along these directions.

@gunjjoshi
Member Author

gunjjoshi commented Apr 2, 2024

Good idea @Pranavchiku. I will explore developing this script too, by referring to some of the scripts that we currently use, such as evalpoly, evalrational, etc. Though these are not at the level we would require, I believe we can take references from them initially and then build on top of that. I'll modify the proposal accordingly.

@Pranavchiku
Member

You need not make last-minute changes to your proposal; it is fine if you don't have it. Just keep working on these things in parallel to boost your progress.

@kgryte
Member

kgryte commented Apr 2, 2024

I think Pranav is referring to your own local development workflow. For example, when I work on new packages, I typically copy-and-paste an existing package which is similar to what I want (e.g., has C benchmarks, is a native add-on, etc), and then modify accordingly. Other folks may have different workflows. E.g., using a command from the terminal which copies over specific files but then scaffolds others. The general gist is that you can spend some time figuring out your preferred approach to authoring packages.

@gunjjoshi
Member Author

Got it @kgryte @Pranavchiku. I will be figuring out what works best for me, as there would be a large number of packages, so definitely spending some time to figure out this will be fruitful.

Also, I've submitted my final proposal. Thanks a lot for the suggestions and for helping out!

@kgryte kgryte closed this as completed Apr 30, 2024