Skip to content

Fix/bug time series.from group dataframe 2491 #2512

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

AlessiopSymplectic
Copy link
Contributor

@AlessiopSymplectic AlessiopSymplectic commented Aug 28, 2024

Checklist before merging this PR:

  • Mentioned all issues that this PR fixes or addresses.
  • Summarized the updates of this PR under Summary.
  • Added an entry under Unreleased in the Changelog.

Fixes #2491

Summary

In the function "from_grou_dataframe()" there was an instance that casted the index as a Datetime index before sorting the index. The line was removed.

A test called "test_grom_group_dataframe()" was added to test_timeseries to verify the intended functionality.

Copy link
Collaborator

@madtoinou madtoinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, minors comments about the tests to make them a bit more exhaustive & ensure that everything behave as expected.

})
ts = TimeSeries.from_group_dataframe(df, group_cols="group", time_col="time")
assert isinstance(ts[0].time_index, pd.RangeIndex)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for time made out of pd.Timestamps, just to make sure they are converted to the expected type (DatetimeIndex)?

You could also try to leverage the pytest.parametrize functionality to make the test a bit shorter.

Also, would be nice to add a case where the index (either list of int or list of Timestamps) are not ordered, to verify that the sorting of the index behave as expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments and the corrections, much appreciated. There are some more tests on the group_from function inside "test_timeseries_static_covariates" already, but I agree, it's better to test and unsorted index (and datetimeindex with timestamps).

Copy link

codecov bot commented Aug 28, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 93.78%. Comparing base (1c29eec) to head (a8958ee).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
darts/timeseries.py 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2512      +/-   ##
==========================================
- Coverage   93.79%   93.78%   -0.02%     
==========================================
  Files         139      139              
  Lines       14740    14728      -12     
==========================================
- Hits        13826    13813      -13     
- Misses        914      915       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

AlessiopSymplectic and others added 6 commits August 29, 2024 10:49
typo in function name

Co-authored-by: madtoinou <32447896+madtoinou@users.noreply.github.com>
* Included a warning about monotonically increasing index also in the case where time_col is set (analogous to the warning in the case 'time_cole is None and df.index.is_monotonic_increasing)
* potential speed improvement by case distinction within the time column (need to check this again)
… set_index and casting DatetimeIndex when the time_col is of type datetime
…concerning "from_group_dataframe" are allocated
Copy link
Collaborator

@dennisbader dennisbader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks @AlessiopSymplectic 🚀
Just some last minor comments, then we're ready to merge.

Comment on lines 121 to 122
assert (ts[0].time_index == pd.RangeIndex(3)).all()
assert (ts[1].time_index == pd.RangeIndex(3)).all()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can write it like this everywhere (and remove the assert in the line above)

Suggested change
assert (ts[0].time_index == pd.RangeIndex(3)).all()
assert (ts[1].time_index == pd.RangeIndex(3)).all()
assert ts[0].time_index.equals(pd.RangeIndex(3))

Comment on lines 121 to 122
assert (ts[0].time_index == pd.RangeIndex(3)).all()
assert (ts[1].time_index == pd.RangeIndex(3)).all()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also adding a check that the values in the TimeSeries are in the expected order:
You can extract the values with ts[0].values()

Copy link
Collaborator

@dennisbader dennisbader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great now, thanks a lot again @AlessiopSymplectic 🚀

@dennisbader dennisbader merged commit ed925c7 into unit8co:master Aug 30, 2024
9 checks passed
@AlessiopSymplectic AlessiopSymplectic deleted the fix/bug_TimeSeries.from_group_dataframe_2491 branch August 30, 2024 12:44
@AlessiopSymplectic AlessiopSymplectic mentioned this pull request Sep 2, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] TimeSeries.from_group_dataframe incompatible with integer timestamps
3 participants