Some ideas for testing the ACE Framework #109

Vinterhav · 2023-10-14T12:08:01Z

To learn how well a model like this works, it has to be tested.
The test results should feed a very strict feedback loop in which the model's functions are corrected.

It could be thought of as a "neural-network-y" way of improving the model. Each change makes the model work a little bit better.
It might be a good idea to choose very different scenarios to see how well the model works in different contexts.
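One way to picture the feedback loop is a plain "keep the change only if the tests improve" cycle. This is a minimal sketch, not anything the ACE Framework prescribes: `feedback_loop`, `propose_change` and `run_tests` are hypothetical names, and it assumes the test results can be reduced to a single score where higher is better.

```python
from typing import Callable, List, Tuple


def feedback_loop(
    config: dict,
    propose_change: Callable[[dict], dict],
    run_tests: Callable[[dict], float],
    rounds: int = 10,
) -> Tuple[dict, List[float]]:
    """Keep a proposed change only if the test score improves."""
    best_score = run_tests(config)
    history = [best_score]
    for _ in range(rounds):
        candidate = propose_change(config)   # hypothetical correction step
        score = run_tests(candidate)         # hypothetical scoring of the tests
        if score > best_score:
            config, best_score = candidate, score
        history.append(best_score)           # best score so far, never decreases
    return config, history
```

The recorded history never goes down, which matches the idea that each accepted change makes the model work a little bit better. How the tests are scored is left open here.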

Outline of how tests could be classified (written down quickly, without much thought, just to give examples):

Practical:
Order a pizza for me. I want it delivered to my door in exactly 1.5 hours from now.
I need new clothes. Find the optimal time for me to go shopping in Stockholm City. I want as few people as possible and I don't want the selection of clothes to be depleted due to seasonal changes.
Apparently there is a plan to build a city dump next to my children's school. What can you do to stop that?

Social:
What would be the best way to convince the German population to re-start the nuclear plants?
Where would the best spot be in England to start a centre for developing AI Cognitive Frameworks?
Do what you can to make my friends like me better.

Economics, statistics:
I want to buy a piece of land to build a house. It must be no further than 100 km from the city centre of Madrid, and I want a good environment and closeness to public transport and water.
We need an extensive forecast of inflation in Europe for the next 5 years and what can be done to lower it.

Combinations of practical and social:
You are my personal assistant. I want my friends to accept you as my own voice. What can you do to help my friends accept that you speak on my behalf?

These examples are not thought through; they were thrown out there only as illustrations.
With a bit more focus on how categories and complexity are classified, they could probably be used.
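The categories above could be kept as a small catalogue of test scenarios. The following is only a sketch: `Category`, `Scenario` and `group_by_category` are made-up names, and the prompts are shortened versions of the examples in this post.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Category(Enum):
    PRACTICAL = "practical"
    SOCIAL = "social"
    ECONOMICS_STATISTICS = "economics, statistics"
    PRACTICAL_AND_SOCIAL = "practical and social"


@dataclass(frozen=True)
class Scenario:
    category: Category
    prompt: str


def group_by_category(scenarios: List[Scenario]) -> Dict[Category, List[str]]:
    """Collect the prompts of each category, in their original order."""
    grouped: Dict[Category, List[str]] = defaultdict(list)
    for scenario in scenarios:
        grouped[scenario.category].append(scenario.prompt)
    return dict(grouped)


# Shortened prompts taken from the examples in the post.
EXAMPLES = [
    Scenario(Category.PRACTICAL, "Order a pizza, delivered in exactly 1.5 hours."),
    Scenario(Category.PRACTICAL, "Find the optimal time to shop for clothes in Stockholm City."),
    Scenario(Category.SOCIAL, "Do what you can to make my friends like me better."),
    Scenario(Category.ECONOMICS_STATISTICS, "Forecast inflation in Europe for the next 5 years."),
    Scenario(Category.PRACTICAL_AND_SOCIAL, "Get my friends to accept you as my own voice."),
]
```

Calling `group_by_category(EXAMPLES)` gives one list of prompts per category, which would make it easy to check that every category is covered before a test run.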

The biggest problem with doing tests is of course that the AI can't interact with society during the tests, so it is impossible to run feedback loops on how well this interaction worked.
That means we are limited to test examples that are socially very constrained.

Vinterhav · 2023-10-14T12:08:46Z

Idea for testing the ACE Framework AIs in small steps.

Start by defining a three-dimensional space made of Complexity, Scope and Time; see picture "Measurable_Values_3D".

One way could be to try out a small volume (cube, where all dimensions are of equal size) and test how well the model performs.
If it works, try out a larger volume. This enlargement of the test-scope could be done in the following ways:

1 - Increase one of the three dimensions at a time, to see how well the model performs in that dimension.

2 - Test one of the dimensions at a time, where the other two dimensions are minimal, to see the effect in just this one dimension.

3 - Test all of it at the same time with super-large tasks, and compare the results with the other tests which are done in fewer dimensions.
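The three ways of enlarging the test volume could be sketched as functions that generate points in the Complexity/Scope/Time space. This assumes each dimension can be expressed as an integer level starting at 1, which the post does not define; the function names are made up for illustration.

```python
from typing import Dict, List

AXES = ("complexity", "scope", "time")
Point = Dict[str, int]


def cube(level: int) -> Point:
    """A test volume where all three dimensions have the same size."""
    return {axis: level for axis in AXES}


def grow_one_axis(base: Point, axis: str, up_to: int) -> List[Point]:
    """Way 1: start from a volume that works and enlarge one dimension."""
    return [{**base, axis: level} for level in range(base[axis] + 1, up_to + 1)]


def isolate_axis(axis: str, up_to: int, minimum: int = 1) -> List[Point]:
    """Way 2: vary one dimension while the other two stay minimal."""
    return [{**cube(minimum), axis: level} for level in range(minimum, up_to + 1)]


def full_scale(up_to: int) -> Point:
    """Way 3: a super-large task, with all dimensions at the top level."""
    return cube(up_to)
```

For example, `grow_one_axis(cube(2), "scope", 4)` yields two test points where only scope grows (3, then 4) while complexity and time stay at 2. Results from `full_scale` could then be compared against the lower-dimensional runs.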

So, if this idea sounds appealing, there remains the not-so-small task of defining:
Complexity
Scope
Time

Let's assume time is already understood, so we can focus on the remaining two.

Complexity could mean
See chart below "Complexity_model_brainstorm"

Scope could mean
How many of each complexity parameter are encountered.
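Read that way, scope is just a tally of how often each complexity parameter shows up in a task. A minimal sketch follows; the parameter names used in the usage note ("actors", "constraints") are placeholders, since the actual parameters are in the chart and not in this text.

```python
from collections import Counter
from typing import Iterable


def scope_of(parameters_encountered: Iterable[str]) -> Counter:
    """Count how many times each complexity parameter occurs in a task."""
    return Counter(parameters_encountered)


def total_scope(scope: Counter) -> int:
    """Collapse the per-parameter counts into a single scope number."""
    return sum(scope.values())
```

With the placeholder input `["actors", "actors", "constraints"]`, `scope_of` counts two "actors" and one "constraints", and `total_scope` gives 3. Whether a single total is a useful measure, or whether the per-parameter counts should be kept apart, is an open question.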

Note that this is just an idea that has been thrown out to illustrate the use of testing different cases. It should be reviewed and changed.
For example, why only use three dimensions?
It could be argued that the more dimensions, the harder it will be to implement, which is true, but using only three dimensions feels a bit contrived.

There are several arguments for using many dimensions, and the counter-arguments should not win out if these many dimensions are what is needed.

Vinterhav · 2023-10-14T12:09:38Z

Vinterhav · 2023-10-14T12:09:52Z

Vinterhav · 2023-10-14T12:15:05Z

Note that this is just a way to classify the complexity of the tests themselves. It may not be that important exactly how they are classified, since the purpose of the classification is to have a tool for separating simple test cases from more advanced ones. Evaluating how well specific tasks were performed is probably the hardest part of all this. My view is that this evaluation has to be done from each of the six levels of the framework to the one above and the one below, in steps through all levels. But I digress, since this was an idea purely about how to classify test cases.
Should there not be a specific section for testing?
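The level-to-level evaluation mentioned above could be enumerated like this. It is only a sketch of which level would evaluate which: the layer names are taken from the ACE Framework's own description rather than from this thread, and `evaluation_pairs` is a hypothetical helper.

```python
from typing import List, Tuple

# Layer names as given in the ACE Framework description, top to bottom.
LAYERS = [
    "Aspirational",
    "Global Strategy",
    "Agent Model",
    "Executive Function",
    "Cognitive Control",
    "Task Prosecution",
]


def evaluation_pairs(layers: List[str]) -> List[Tuple[str, str]]:
    """Return (evaluator, evaluated) pairs between neighbouring layers."""
    pairs = []
    for upper, lower in zip(layers, layers[1:]):
        pairs.append((upper, lower))  # a layer evaluates the one below it
        pairs.append((lower, upper))  # and the one above it
    return pairs
```

With six layers this gives ten evaluator/evaluated pairs. What each evaluation actually measures is the hard part and is not covered here.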
