Skip to content

[BEAM-1903, BEAM-1904] Fixes SDF issues re: watermarks and stop/resume #2455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 8, 2017

Conversation

jkff
Copy link
Contributor

@jkff jkff commented Apr 6, 2017

See detailed discussion in document:
https://docs.google.com/document/d/1BGc8pM1GOvZhwR9SARSVte-20XEoBUxrGJ5gTWXdv3c/edit

In particular, this PR:

  • Removes DoFn.ProcessContinuation and all the code and tests that handled returning resume() - from DoFnSignatures, DoFnInvokers, and SplittableParDo and runner hooks, as well as inference of UnboundedPerElement from the return value.
  • Adds ProcessContext.updateWatermark(), currently allowed to be called only from SDF. This has surprisingly little impact - basically just a "throw unsupported" implementation in SimpleDoFnRunner and DoFnTester, and a proper implementation in SDF code
  • Adds RestrictionTracker.checkDone() and unit tests

Overall this looks much easier of a change than I thought, and should be pretty easy to review. 90% of the code is for bullet 1; bullet 2 is not really separable from it so it's in the same commit. Bullet 3 could be separated; I can put it in a separate commit if needed, but it's a small amount of code.

R: @tgroh

@jkff
Copy link
Contributor Author

jkff commented Apr 6, 2017

Run Flink ValidatesRunner

@asfbot
Copy link

asfbot commented Apr 6, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/9257/
--none--

@jkff
Copy link
Contributor Author

jkff commented Apr 6, 2017

Run Flink ValidatesRunner

@asfbot
Copy link

asfbot commented Apr 6, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/2227/
--none--

@asfbot
Copy link

asfbot commented Apr 6, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/9258/

Build result: FAILURE

[...truncated 893.52 KB...]Exit code: 1 - /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableProcessElementInvoker.java:46: error: reference not found * Can be {@code null} only if {@link #getContinuation} specifies the call should not resume. ^/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableProcessElementInvoker.java:49: warning - Tag @link: can't find getContinuation in org.apache.beam.runners.core.SplittableProcessElementInvoker.ResultCommand line was: /usr/local/asfpackages/java/jdk1.8.0_121/jre/../bin/javadoc @options @packagesRefer to the generated Javadoc files in '/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/runners/core-java/target/apidocs' dir. at org.apache.maven.plugin.javadoc.AbstractJavadocMojo.executeJavadocCommandLine(AbstractJavadocMojo.java:5188) at org.apache.maven.plugin.javadoc.AbstractJavadocMojo.executeReport(AbstractJavadocMojo.java:2075) at org.apache.maven.plugin.javadoc.JavadocJar.execute(JavadocJar.java:188) ... 33 more2017-04-06T22:38:19.106 [ERROR] 2017-04-06T22:38:19.106 [ERROR] Re-run Maven using the -X switch to enable full debug logging.2017-04-06T22:38:19.106 [ERROR] 2017-04-06T22:38:19.106 [ERROR] For more information about the errors and possible solutions, please read the following articles:2017-04-06T22:38:19.106 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException2017-04-06T22:38:19.106 [ERROR] 2017-04-06T22:38:19.106 [ERROR] After correcting the problems, you can resume the build with the command2017-04-06T22:38:19.106 [ERROR] mvn -rf :beam-runners-core-javachannel stoppedSetting status of 1e6c215 to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/9258/ and message: 'Build finished. 'Using context: Jenkins: Maven clean install
--none--

@asfbot
Copy link

asfbot commented Apr 6, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/2228/
--none--

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 5e2cd3a on jkff:sdf-updates into ** on apache:master**.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.02%) to 69.826% when pulling 71f7afa on jkff:sdf-updates into b92032f on apache:master.

@asfbot
Copy link

asfbot commented Apr 7, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/9265/
--none--

@jkff
Copy link
Contributor Author

jkff commented Apr 7, 2017

Run Dataflow ValidatesRunner

@jkff
Copy link
Contributor Author

jkff commented Apr 7, 2017

(I had to keep DoFn.ProcessContinuation in place to preserve compatibility with Dataflow worker - but I made it empty and deprecated it, so the only value of that type passed around in the program is null. I'll actually remove it using separate cleanup PRs)

@asfbot
Copy link

asfbot commented Apr 7, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2766/

Build result: FAILURE

[...truncated 4.17 MB...]2017-04-07T07:08:04.042 [INFO] 2017-04-07T07:08:04.042 [INFO] Apache Beam :: Parent .............................. SUCCESS [ 19.613 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: SDKs :: Java :: Build Tools ......... SUCCESS [ 11.218 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: SDKs ................................ SUCCESS [ 1.997 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: SDKs :: Common ...................... SUCCESS [ 1.649 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: SDKs :: Common :: Fn API ............ SUCCESS [ 12.213 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: SDKs :: Common :: Runner API ........ SUCCESS [ 6.340 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: SDKs :: Java ........................ SUCCESS [ 4.337 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: SDKs :: Java :: Core ................ SUCCESS [01:54 min]2017-04-07T07:08:04.042 [INFO] Apache Beam :: Runners ............................. SUCCESS [ 1.613 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: Runners :: Core Java Construction ... SUCCESS [ 10.115 s]2017-04-07T07:08:04.042 [INFO] Apache Beam :: Runners :: Google Cloud Dataflow .... SUCCESS [ 01:11 h]2017-04-07T07:08:04.042 [INFO] ------------------------------------------------------------------------2017-04-07T07:08:04.042 [INFO] BUILD SUCCESS2017-04-07T07:08:04.042 [INFO] ------------------------------------------------------------------------2017-04-07T07:08:04.042 [INFO] Total time: 01:17 h2017-04-07T07:08:04.042 [INFO] Finished at: 2017-04-07T07:08:04+00:002017-04-07T07:08:04.676 [INFO] Final Memory: 215M/1706M2017-04-07T07:08:04.676 [INFO] ------------------------------------------------------------------------Waiting for Jenkins to finish collecting datachannel stoppedSending e-mails to: commits@beam.apache.orgSetting status of 71f7afa to FAILURE with url https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2766/ and message: 'Build finished. 'Using context: Jenkins: Google Cloud Dataflow Runner ValidatesRunner Tests
--none--

@jkff
Copy link
Contributor Author

jkff commented Apr 7, 2017

Um it looks like Dataflow ValidatesRunner actually passed, but Jenkins failed to collect the result or something.

https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2766/console says:
2017-04-07T07:08:04.042 [INFO] BUILD SUCCESS
and
Finished: SUCCESS

So this is ready for review.

@jkff
Copy link
Contributor Author

jkff commented Apr 7, 2017

Run Dataflow ValidatesRunner

@asfbot
Copy link

asfbot commented Apr 7, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2770/
--none--

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.04%) to 69.807% when pulling d37c99b on jkff:sdf-updates into b92032f on apache:master.

@asfbot
Copy link

asfbot commented Apr 7, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/9295/
--none--

@@ -140,7 +129,6 @@ public void testInvokeProcessElementTimeBounded() throws Exception {
public void testInvokeProcessElementVoluntaryReturn() throws Exception {
SplittableProcessElementInvoker<Integer, String, OffsetRange, OffsetRangeTracker>.Result res =
runTest(5, Duration.millis(100));
assertFalse(res.getContinuation().shouldResume());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems as though there ought be a similar "nothing left to do" assertion that can be made.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what the assertNull below does.

public Instant getWatermarkHold() {
for (State state : this.inMemoryState.values()) {
if (state instanceof WatermarkHoldState) {
return ((WatermarkHoldState<?>) state).read();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we guarantee that there will only ever be a single WatermarkHoldState in a StateInternals? Where?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with TestInMemoryStateInternals which already has a function like this.

@@ -86,8 +89,12 @@ public SomeRestriction currentRestriction() {
public SomeRestriction checkpoint() {
return someRestriction;
}

@Override
public void checkDone() throws IllegalStateException {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm declared exception here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@asfgit asfgit merged commit 29c2802 into apache:master Apr 8, 2017
asfgit pushed a commit that referenced this pull request Apr 8, 2017
@jkff jkff deleted the sdf-updates branch April 8, 2017 00:35
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.01%) to 69.813% when pulling 29c2802 on jkff:sdf-updates into 4a694ce on apache:master.

@asfbot
Copy link

asfbot commented Apr 8, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/9308/
--none--

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants