ENH: asarray for Array API support #24

Closed
wants to merge 90 commits into from

Conversation

@tupui tupui commented Apr 21, 2023

Add helper functions to support Array API.

I've taken some inspirations from the open PR we had on SciPy and what is done in sklearn.

  • namespace_from_arrays is responsible for getting the namespace. As we want to use this alongside asarray, it wraps array_api_compat.array_namespace, which would otherwise raise for array-like objects such as Python lists or scalars.
x, y = [0, 1, 2], np.arange(3)
xp = namespace_from_arrays(x, y)
xp.__name__
# 'array_api_compat.numpy'
  • asarray is a drop-in replacement that we could use throughout our code base. We use things like order and dtype; sklearn's version seems to handle those well and also adds copy.
x_, y_ = asarray([0, 1, 2], xp=xp), asarray(np.arange(3), xp=xp)
x_, y_
# (array([0, 1, 2]), array([0, 1, 2]))

asarray_namespace is a first attempt at a function that does it all. It seems to be working for now, but needs work to make it efficient. (We could also decide that it is not needed at all.)

x, y, xp = asarray_namespace(x, y)
xp.__name__
# 'array_api_compat.numpy'
x, y
# (array([0, 1, 2]), array([0, 1, 2]))
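The fallback behaviour described above could be sketched roughly like this. This is a hypothetical illustration, not the PR's actual code; it assumes array_api_compat is available and falls back to a NumPy stub when it is not:

```python
import numpy as np

try:
    from array_api_compat import array_namespace as _array_namespace
except ImportError:
    # Stub so the sketch runs without array_api_compat installed.
    def _array_namespace(*arrays):
        if all(isinstance(a, np.ndarray) for a in arrays):
            return np
        raise TypeError("unrecognized array input")


def namespace_from_arrays(*arrays):
    """Return the arrays' common namespace, defaulting to NumPy.

    array_namespace raises TypeError for array-likes such as lists or
    scalars; here we catch that and fall back to NumPy instead.
    """
    try:
        return _array_namespace(*arrays)
    except TypeError:
        return np


def asarray_namespace(*arrays):
    """Convert all inputs with the namespace's asarray; return arrays + xp."""
    xp = namespace_from_arrays(*arrays)
    return *(xp.asarray(a) for a in arrays), xp


x, y, xp = asarray_namespace([0, 1, 2], np.arange(3))
```

Because a plain list is among the inputs, the wrapped array_namespace raises and the sketch falls back to NumPy, matching the behaviour shown in the snippets above.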

I still need to go through the RFC and see how to handle all cases.

There might be other things to borrow from sklearn (e.g. they have _ArrayAPIWrapper). For now I would not bother with things like their _NumPyAPIWrapper, as this will not get into this release, and by the time we do have something here we might already require NumPy >= 1.22. (EDIT: well, we might need something like that to have concat and the like.)

cc @rgommers

@rgommers

Looks like a good start! I'd try adding masked array validation and the temporary global switch next - that will give a pretty good idea of the usage pattern already.

@tupui

tupui commented Apr 21, 2023

Ok I will do that 👍

namespaces = set()
for array in arrays:
    try:
        namespaces.add(array_api_compat.array_namespace(array))

array_api_compat.array_namespace already accepts multiple arrays. Is the issue that there isn't a way to specify numpy as a default?

Owner Author

Yes, I would need the function to not raise but return NumPy. This is because we want to accept things like Python lists or scalars.

I think we can add a flag to array_namespace like default=numpy that makes it do this. That function's not part of the spec, so we can adjust it in whatever way makes it most useful.

Owner Author

That would help with our use case 😃

Quick question, should this really fail?

array_api_compat.array_namespace(np.array(1), numpy.array_api.asarray(1))

This is why I have the following to work around it and convert numpy.array_api to array_api_compat.numpy:

  if numpy.array_api in namespaces:
      namespaces.remove(numpy.array_api)
      namespaces.add(array_api_compat.numpy)

And in general any compat code that's useful to more than one library can go in the compat library (assuming it's not too complex and pure Python).

Owner Author

Ok, I can do a PR for that if you want.

I wouldn't try mixing numpy and numpy.array_api. numpy.array_api should only be used for testing purposes. It implements a strict version of the standard, so you can use it to check that you are staying within the spec. It shouldn't be used for actual user code.

The whole purpose of the compat library is to provide sufficient wrappers around numpy itself to make it array API compatible. Using numpy.array_api for user code was found to be too challenging because it uses a different array class from NumPy, which is not what most users want.

Owner Author

Mmm ok, so when I test, I should consider numpy.array_api to be something different from numpy, such as, say, CuPy. Makes sense to me 👍 Thanks for the explanations Aaron 😃

@asmeurer

I think some of this stuff can be upstreamed to array-api-compat.

@tupui

tupui commented Apr 26, 2023

I added a functional global switch, a to_numpy for Cython and some error logic for masked arrays and matrix.

Next I was planning on looking at scipy.cluster and how to actually use this. Unless there is something else I would do first?

xp_name = xp.__name__

if xp_name in {"array_api_compat.torch", "torch"}:
    return array.cpu().numpy()

You don't want to do this. Just using np.asarray(array) is better. As a rule, never do silent device transfers from GPU to CPU or vice versa. This is true for CuPy too - that should simply error.

  • This is especially useful to pass arrays to Cython.

This conversation was specifically about PyTorch CPU tensors. There may be other CPU array libraries where this works. But basically, this function doesn't seem needed; in any code where you planned to use it, I think you want np.asarray instead.
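The rule above could be sketched as a small guard (an illustrative sketch, not SciPy's code; the hypothetical name and the assumption that GPU arrays raise TypeError or RuntimeError from np.asarray are mine):

```python
import numpy as np

def to_numpy_strict(array):
    """Convert to a NumPy array without any silent device transfer.

    np.asarray handles CPU arrays (including CPU torch tensors, via the
    __array__ protocol); GPU arrays are expected to raise instead of
    being copied across devices implicitly.
    """
    try:
        return np.asarray(array)
    except (TypeError, RuntimeError) as exc:
        raise TypeError(
            "refusing to convert a device array to NumPy implicitly; "
            "move it to the host explicitly first"
        ) from exc
```

The point is that the error path stays an error: the caller is forced to do the device transfer explicitly rather than having the helper hide it.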

Owner Author

Ok I will remove 👍 This was in sklearn so I thought this was more "optimal" to do the conversion using that.

Hmm interesting. @thomasjpfan I thought you preferred exceptions?

@thomasjpfan thomasjpfan Apr 27, 2023

Scikit-learn has a function to do device transfers strictly for testing purposes; it is not part of the public API. Scikit-learn checks that the GPU implementation gives similar results to the CPU implementation.

This type of testing does not have full coverage, but it's nice to have some checks to make sure the GPU code paths do something reasonable.

Ah yes, for testing it makes sense indeed. As long as we make sure not to accidentally introduce it into the code base outside of testing.

I added some utilities for testing back on the host from the device in my branch; my approach was quite different though. Perhaps I can try to make a PR to this branch.

@rgommers rgommers left a comment

Thanks Pamphile. I added a few comments about the behavior, those should not be hard to address I think - overall this seems in pretty good shape already.

  • Next I was planning on looking at scipy.cluster and how to actually use this

Yes, that seems like a good next step.

@tupui tupui left a comment

Thanks Ralf. I will make the changes and then move on to cluster 🤞


    """
    if xp is None:
        xp = array_namespace(array)
    if xp.__name__ in {"numpy", "array_api_compat.numpy", "numpy.array_api"}:

No one's actually doing it yet as far as I know, but this wouldn't work if someone vendors array_api_compat and tries to call a scipy function.

Also I don't know if it makes sense to list numpy.array_api here. That namespace is designed to only support a strict implementation of the standard, which doesn't include order.

For scikit-learn, we wanted the performance with numpy.array_api to be the same as numpy. When one explicitly sets the order, there is usually a performance reason for doing so. For example:

def scipy_func(X):
    xp = array_namespace(X)
    
    # switch order for performance reasons
    X_f = asarray(X, xp, order="F")
    
    # Do some operations that prefer F-ordered data.
    return xp.sum(X_f, axis=0)

By the way something like

_X = numpy.asarray(..., order="F")
X = numpy.array_api.asarray(_X)

will also work. That's maybe a little more "spec compliant" in the sense that converting arrays from one library to another with asarray is supported. In this case it's a trivial zero-copy wrapping but in general it will use DLPack (although I don't know how DLPack handles order, so maybe someone could confirm whether this would actually work in a more general setting).


Actually that's wrong. I thought asarray in the spec used dlpack, but it's only numpy.asarray that does. In the spec you have to use from_dlpack (I'm not sure why they are separate).
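For NumPy-to-NumPy the hand-off is zero-copy either way; a minimal illustration of the spec-style route (np.from_dlpack requires NumPy >= 1.22):

```python
import numpy as np

x = np.arange(3)
y = np.from_dlpack(x)  # spec-style hand-off between array libraries

# DLPack exports a view rather than a copy, so y shares memory with x
x[0] = 7
```

Since DLPack carries strides, an F-ordered array round-trips with its layout intact in the NumPy-to-NumPy case; whether every consumer library preserves that is a separate question.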

@tupui

tupui commented May 1, 2023

What should we do with _asarray_validated? It's used 69 times throughout 6 modules.

I think the simplest would be to call asarray_namespace within _asarray_validated. It would behave like asarray_namespace when the flag is on; otherwise it would respect the current behaviour (with the addition that it would also return the namespace).

Alternatively, we could add the checks from _asarray_validated into asarray_namespace. But that would be bulky, so a wrapper would probably look better. Still not optimal, I think.

What do you think?

@thomasjpfan

  • I think the simplest would be to call asarray_namespace within _asarray_validated.

I would go with this option to keep asarray_namespace simple. In this case, _asarray_validated needs to be updated to work with Array API. For example, np.asarray_chkfinite needs to be re-implemented using Array API.
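A rough sketch of what re-implementing the check with the array API could look like. This is illustrative only, not SciPy's actual code, and the function name is hypothetical; the idea is to replace np.asarray_chkfinite with the namespace's asarray plus an explicit finiteness check:

```python
import numpy as np

def asarray_chkfinite_xp(a, xp=np):
    """asarray plus a finiteness check, written against the array API.

    Uses only functions in the standard (asarray, isfinite, all), so any
    conforming namespace can be passed as xp; NumPy is the default here.
    """
    a = xp.asarray(a)
    if not xp.all(xp.isfinite(a)):
        raise ValueError("array must not contain infs or NaNs")
    return a
```

Unlike np.asarray_chkfinite this makes an extra pass over the data, so the places among the 69 call sites that are performance-sensitive might want to skip the check when check_finite is off, as _asarray_validated already does.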

@tupui

tupui commented Jun 13, 2023

  • Changes of assert_equal to assert_allclose: the latter uses a default relative tolerance of 1e-7, which is quite loose. I'd expect that if exact equality is dropped, this would use an rtol in the 1e-12 to 1e-15 range in most cases.

I just changed that because otherwise there was an issue about boolean comparison. I can try to change rtol to be stricter 👍

  • The if xp.__name__ in {"array_api_compat.torch", "torch"}: special-casing: is it not possible to use xp.matmul unconditionally (also, it should be used as a function, not a method)?

Actually yes, just using xp.matmul did the trick and no specialisation is needed 😃

Ok I will make the PR now. Thank you all for helping out and see you on the other side 😉

@rgommers

  • Ok I will make the PR now. Thank you all for helping out and see you on the other side 😉

🎉 🚀

@tupui

tupui commented Jun 22, 2023

Just closing this on my fork for bookkeeping on my side.

@tupui tupui closed this Jun 22, 2023
5 participants